(57) Abstract: Modified polypeptides containing a pilus-protein sequence and a
donor complementary sequence are disclosed, as well as complexes formed of a
modified polypeptide and a pilin, or pilus-protein. Also disclosed are methods
of using these novel polypeptides as a means of preventing and/or treating
bacterially induced diseases, especially those caused by enterobacteria such as
Escherichia coli. Methods of employing these modified pilus-derived
polypeptides and complexes as vaccines and for generating antibodies for
further study as well as for clinical purposes are also disclosed herein. In
addition, processes for large scale production of antibacterial vaccines
containing said polypeptides and complexes are also described.

DONOR STRAND COMPLEMENTED
PILIN AND ADHESIN BROAD-BASED VACCINES

This application claims the priority of U.S. Provisional Application
60/184,442, filed 23 February 2000, and of U.S. Provisional Application
60/144,359, filed 16 July 1999, and of U.S. Provisional Application
60/143,582, filed 13 July 1999, the disclosures of all three of these
applications being hereby incorporated by reference in their entirety.



FIELD OF THE INVENTION

The present invention relates to immunogenic polypeptides and 
complexes useful in preventing and treating bacterial diseases and to methods 
of preparing such polypeptides and complexes. 


BACKGROUND OF THE INVENTION 

Newly formed protein chains can quickly fold in vitro to form the native
conformation, a process that often requires no outside assistance because the
steric information for the three-dimensional structure of the protein resides
in the amino acid sequence. ATP-dependent proteins, called "chaperones," may
aid folding of some proteins in vivo.

Conversely, there is another class of chaperones not requiring ATP for
their functioning. These include the PapD-like periplasmic chaperones found in
bacteria. As used herein, and unless expressly stated otherwise, the term
"chaperone" means exclusively this latter class of periplasmic chaperones
having the further distinguishing characteristics as described below.



In bacterial species, this latter class of chaperones is responsible for
mediating the synthesis of large scale oligomeric structures, for example, the
construction of pili, the adhesive fibers expressed in most bacteria of the
Enterobacteriaceae family (e.g., Escherichia coli).

Pili are heteropolymeric structures that are composed of several
different structural proteins required for pilus assembly. Pili, also called
fimbriae or fibrillae, facilitate the adhesive qualities of bacteria that often
lead to colonization and infection of various tissues of the host animal,
especially on mucosal surfaces. Such adhesion is facilitated by the presence in
the pilus of a protein called an "adhesin," of which FimH is an example.

Different types of pili have been recognized. Type 1 pili-carrying bacteria
recognize and bind to D-mannose in glycolipids and glycoproteins of bladder
epithelial cells. Proteins forming the pili have been considered good candidates
for vaccines. P pili are adhesive organelles encoded by eleven genes in the pap
(pilus associated with pyelonephritis) gene cluster found on the chromosome
of uropathogenic strains of E. coli. The biogenesis of P pili and Type 1 pili
occurs via the highly conserved chaperone/usher pathway. (Thanassi et al.,
Curr. Opin. Microbiol. 1, 223 (1998); Hung et al., EMBO J. 15, 3792 (1996)).

Type 1 pili are composite fibers consisting of a short thin tip fibrillum
joined to a thicker, rigid pilus rod. (C.C. Brinton, Jr., Trans. N.Y. Acad. Sci.
27, 1003 (1965); Jones et al., Proc. Natl. Acad. Sci. USA 92, 2081 (1995)) The
pilus fiber is an ordered array of homologous pilins (FimA, FimF, FimH, and
FimG) with the FimH adhesin at its tip. FimH and FimG have been purified as a
complex and comprise the bulk of the tip fibrillum, which may also contain
FimF. (Jones et al. (1995)) The rod is comprised of FimA monomers arranged in
a right-handed helical cylinder. (Brinton (1965)) Genes important in type 1
pilus biogenesis (fimA-fimH) are organized into the fim operon. (Orndorff and
Falkow, J. Bacteriol. 159, 736 (1984)) More specifically, FimH mediates binding
to mannose-oligosaccharides present on mucosal surfaces. Thus, FimH mediates
adherence to mannosylated receptors on the bladder epithelium and is critical
to the ability of uropathogenic Escherichia coli to cause cystitis. (See: Mulvey
et al., Science 282, 1494 (1998); Langermann et al., Science 276, 607 (1997);
Krogfelt et al., Infect. Immun. 58, 1995 (1990); Connell et al., Proc. Natl.
Acad. Sci. USA 93, 9827 (1996)).

The PapD-like superfamily of periplasmic chaperones directs the
assembly of over 30 diverse adhesive surface organelles that mediate the
attachment of many different pathogenic bacteria to host tissues, a critical
early step in the development of disease. (See Soto and Hultgren, J. Bacteriol.
181, 1059 (1999)) PapD, the prototypical chaperone, is necessary for the
assembly of P pili (Lindberg et al., J. Bacteriol. 171, 6052 (1989)) whereas its
homologue, called FimC, directs the assembly of type 1 pili (Jones et al., Proc.
Natl. Acad. Sci. USA 90, 8397 (1993)).


E. coli is the most common pathogen of the urinary tract, accounting for
greater than 85% of cases of asymptomatic bacteriuria, acute cystitis and
acute pyelonephritis, as well as greater than 60% of recurrent cystitis, and at
least 35% of recurrent pyelonephritis infections. Because of the high
incidence, continued persistence, and significant expense associated with E.
coli urinary tract infections, there is a need for a prophylactic vaccine to
reduce susceptibility to this disease.

While many factors contribute to the acquisition and progression of E.
coli urinary tract infections, it is widely accepted that colonization of the
urinary epithelium is a required early step. Therefore, disruption or prevention
of pilus-mediated attachment of E. coli to urinary epithelia may prevent or
retard the development of urinary tract infections.

For example, type 1 pili, as noted, are believed to be important in
initiating colonization of the bladder and inducing cystitis, whereas P pili are
thought to play a role in ascending infections and the ensuing pyelonephritis.
Thus, pili mediate microbial attachment to the surfaces of cells, often the
essential first step in the development of a disease, by binding to receptors
present in host tissues.

However, a major disadvantage of pilus-based vaccines has been the
fact that the major immunodominant components of pilus fibers are often
highly variable antigenically and therefore afford protection against only a
limited number of bacterial strains. In contrast, pilus-associated adhesins,
such as FimH, are relatively conserved proteins among different species and
strains of bacteria. Thus, FimH is relatively conserved not only among
uropathogenic strains of E. coli, but also among a wide range of gram-negative
bacteria. For example, many members of the family Enterobacteriaceae produce
FimH, and vaccines incorporating the FimH antigen should exhibit a broad
spectrum of protection.

The major drawback to adhesin-based vaccines of any kind has been
the fact that adhesins are often only a minor component of the pilus, cannot
be produced in large quantities, and therefore will tend not to elicit a
particularly strong immunogenic effect. Although recombinant technology has
succeeded in producing adhesin proteins in pure form, these are often rapidly
proteolytically degraded when the corresponding chaperone is absent. Such
adhesins are readily stabilized by the presence of periplasmic chaperone
molecules (the latter also being important in proper synthesis of adhesins).

Gram-negative bacteria, of which Escherichia coli is an example, have a
characteristic cell surface arrangement. They possess an inner plasma
membrane, surrounded by a peptidoglycan cell wall, which in turn is
surrounded by an outer membrane, the latter being highly permeable to many
substances. Between the cell wall and the outer membrane lies the periplasmic
space. Proteins destined for secretion or assembly across the outer membrane
often must fold within the periplasmic space prior to their secretion and/or
assembly. Chaperones are often to be found within this periplasmic space.
Among the proteins found in the periplasm are the adhesin FimH and its
chaperone FimC.

Throughout this disclosure the terms pilus, pili, fimbrium, fimbriae,
fibrillum and fibrillae are used interchangeably, with incidental use of the
singular or plural form of any of these terms in no way limiting the breadth of
the disclosed invention.

A "periplasmic chaperone" is defined herein as a protein localized in the
periplasm of bacteria that is capable of forming complexes with a variety of
proteins, especially pilus-proteins, including adhesins, especially FimH (where
the corresponding chaperone is FimC) via recognition of a common binding
epitope (or epitopes). Such chaperones are characterized by their similarity in
properties to PapD, especially by their possession of an immunoglobulin-like
fold for binding to pilus-proteins, such as adhesins. Chaperones serve as
templates upon which proteins exported from the bacterial cell into the
periplasm fold into their native conformations, especially where such proteins
are intended to form oligomeric structures such as pili. Association of the
chaperone-binding protein with the chaperone also serves to protect the
binding proteins from degradation by proteases localized within the periplasm,
increases their solubility in aqueous solution, and leads to their sequentially
correct incorporation into an assembling pilus.

X-ray diffraction studies have helped to reveal the different domains
found within the structures of the various adhesin and chaperone proteins.
Resolution of the crystal structure of PapD, the prototype periplasmic
chaperone, has revealed at least two domains having the overall topology of an
immunoglobulin fold. [Holmgren & Branden, Nature 342, 248 (1989)] The two
domains are connected by a hinge region and oriented such that a cleft is
created between these two domains. Further, other work has suggested that
invariant surface-exposed residues that protrude into this cleft make up the
subunit binding pocket. [Slonim et al., (1992) EMBO J. 11, 4747-4756] Unlike
cytoplasmic chaperones, PapD maintains target structures in native-like
conformations. [Lecher et al., (1989) EMBO J. 8, 2703-2709; Kuehn et al.,
(1991) Proc. Natl. Acad. Sci. USA 88, 10586-10590] Such periplasmic
chaperones have an effector function, specifically targeting the subunits to
outer membrane assembly sites for their incorporation into pili, and are
characterized in part by the presence of an immunoglobulin-like fold.

Two of the genes in the pap operon - papD and papC - encode the
chaperone and usher, respectively. Six genes encode structural pilus subunits,
PapA, PapH, PapK, PapE, PapF, and PapG, which assemble into a
heteropolymeric surface fiber with an adhesive tip (PapG). (Hultgren et al.,
Cell 73, 887 (1993)). The ability of PapG to bind to galabiose receptors in the
human kidney is a critical event in the pathogenesis of pyelonephritis. The pilus
consists of two major sub-assemblies, a thick, rigid rod made up of repeating
PapA subunits arranged in a right-handed helical cylinder and a thin, flexible tip
fiber (the tip fibrillum) extending from the distal end of the rod and composed
primarily of repeating PapE subunits arranged in an open-helical configuration.
Two components of the tip fibrillum, PapK and PapF, act as adaptors. PapK is
thought to link the pilus rod to the base of the tip fibrillum and to regulate
the length of the tip fibrillum: its incorporation terminates growth of the tip
fibrillum and nucleates the formation of the pilus rod. PapF is thought to join
the PapG adhesin to the distal end of the flexible tip fibrillum.

Like PapD, FimC uses its immunoglobulin-like domains to recognize and
bind to pilus subunit proteins, such as the adhesin FimH.

The co-ordinated assembly of pili, as well as of other complex
hetero-oligomeric organelles, requires correct incorporation of individual
subunits in a predefined order during biogenesis and the prevention of premature
associations between the intrinsically aggregative subunits. Type 1 pilus
biogenesis proceeds via a highly conserved pathway, the chaperone-usher
pathway, that is involved in the assembly of over 30 adhesive organelles in
gram-negative bacteria. [Soto & Hultgren, J. Bacteriol. 181, 1059 (1999)]
The assembly machinery is comprised of two specialized classes of
proteins, a periplasmic chaperone and an outer membrane usher. [Thanassi et
al., Curr. Opin. Microbiol. 1, 223 (1998)] Using its immunoglobulin-like folds,
the periplasmic chaperone FimC forms periplasmic complexes with each of the
pilus subunits prior to their incorporation into the pilus. [Soto & Hultgren
(1999)] Furthermore, genetic and structural studies have shown that
chaperones recognize a highly conserved motif present in the C-terminal
portions of all subunits assembled by all PapD-like chaperones. [Hung et al.,
EMBO J. 15, 3792 (1996); Kuehn et al., Science 262, 1234 (1993); Hultgren
et al., in Molecular Biology of Chaperones and Folding Catalysts: Regulation,
Cellular Functions and Mechanisms, B. Bukau, Ed. (Harwood Academic
Publishers, Amsterdam, 1999), p. 661]

The chaperone activity of FimC has been demonstrated and FimC has
been shown to bind FimH, the adhesin of type 1 pili, to form periplasmic
preassembly complexes. Thus, a FimH-FimC complex has been isolated using
mannose-Sepharose chromatography of periplasm and then eluting with
D-mannose. (See Jones et al. (1993)) FimH is folded in the FimH-FimC complex
in such a way that the mannose binding domain is in a native state and
accessible for substrate binding. In addition, the amino acid sequence of FimC
is known. (See Jones et al. (1993))


The crystal structure of PapD, the prototypical periplasmic chaperone,
has been solved (Holmgren and Branden, Nature 342, 248 (1989)) and refined
to 2.0 Å resolution, revealing a molecule with two immunoglobulin-like
domains oriented in an L shape to form a cleft at their surface. All 30+
members of the periplasmic chaperone superfamily have a conserved
hydrophobic core that maintains the overall features of the two domains.
During pilus biogenesis, PapD binds to and caps interactive surfaces on pilus
subunits and prevents their premature aggregation in the periplasm. A
combination of genetic, biochemical and crystallographic data has
demonstrated that the G1 β-strand of PapD forms a β-zipper interaction with
the highly conserved COOH-terminal motif of pilus subunits. The
COOH-terminal motif also comprises at least part of a primary surface for
subunit-subunit assembly interactions, indicating that the direct capping of a
primary assembly surface is part of the molecular basis by which periplasmic
chaperones prevent the premature oligomerization of pilus subunits. In
addition, the β-zipper interaction has been proposed to facilitate the folding of
the subunit into a native-like conformation via a template-mediated mechanism.
(Soto et al., EMBO J. 17, 6155 (1998)).

While the utility of adhesins as vaccines has been demonstrated, large
scale production of adhesins and other pilus-derived proteins has been
complicated by the requirement of a chaperone that must be co-expressed
with the adhesin in order for it to properly fold and result in a stable structure.
It would therefore be highly advantageous if adhesins could be produced in
pure form without the need of co-expressing the chaperone, and without the
need for the chaperone, or any other protein, at all, thereby permitting large
scale production of pure adhesins, or any other pilus subunits, for use, inter
alia, as vaccines.


BRIEF SUMMARY OF THE INVENTION 

The present invention relates to immunogenic complexes and
polypeptides, comprising a pilus protein component or portion, and a donor
strand component, or portion, wherein the pilus protein and donor strand may
or may not be covalently bonded together.

It is an object of the present invention to provide polypeptides
containing an amino acid sequence derived from a bacterial pilus-protein,
including adhesins, such as FimH, or a pilin such as PapK, and a portion of
another protein acting as a donor complement, especially where the latter is a
chaperone, such as FimC or PapD, or a portion of another subunit, such as
FimG or FimF.


In specific embodiments, the present invention provides a polypeptide
comprising a pilin protein, such as FimH, attached to a donor strand, such as a
strand composed of the first 13 residues of FimG, separated by a short
intervening sequence, such as a tetrapeptide, to form a single chain
immunogenic polypeptide. Other specific embodiments include a similar
structure employing PapK with a sequence of PapD attached to the C-terminus
of PapK via a short amino acid linking sequence to form a single chain
polypeptide. Of course, the invention also encompasses such structures
wherein the pilin and donor strand segments are non-covalently linked to each
other so long as the correct conformation of the pilin is maintained.

In a highly preferred embodiment, polypeptides, or complexes thereof,
of the present invention comprise a pilus subunit and a donor strand derived
from the N-terminal extension of another pilus subunit.

It is still a further object of the present invention to provide vaccines
containing the modified pilus polypeptides as a means of combating diseases,
especially preventing diseases by vaccination, wherein such diseases are
caused by bacterial species from which the adhesin structure is derived,
especially E. coli.

It is an additional object of the present invention to provide for
antibodies specific for the modified pilus-protein structures disclosed herein for
use in treating diseases caused by bacteria from which the adhesin or pilin
structure was derived, especially E. coli.

It is yet a further object of the present invention to provide a method for
synthesizing modified pilus-proteins, in large scale, either in a free state or in a
complex with another pilin protein, for use as vaccines, without the need of
producing corresponding proteins, such as chaperones. Such methodology will
thereby facilitate the production of such pilus-based (pilin-based and
adhesin-based) vaccines as a single polypeptide chain, thus reducing the time,
cost and complexity of production and facilitating large scale vaccine
production.


BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows a comparison of the amino acid sequences of the
various Fim proteins that compose the pili of bacteria such as E. coli. The
sequences of FimH, FimA, FimF and FimG are all aligned to show
corresponding sequences and domains. The end of the mannose-binding lectin
domain and the start of the pilin domain in FimH are indicated by vertical
arrowheads above the sequences. The type 1 pilin subunits (FimA, FimF and
FimG) are aligned with the pilin domain of FimH using Clustal W [see:
Thompson et al., Nucleic Acids Res. 22, 4673 (1994)] and manually adjusted
to minimize gaps in secondary structure elements. Gaps in the alignment are
indicated by dots. Sequence numbering for FimH starts at position 22 in the
pre-protein. Pilus subunits (including FimH) are expressed in the cytoplasm as
pre-proteins with an amino terminal signal sequence that is cleaved during
transport across the inner membrane. The first residue in FimH that is visible in
our maps corresponds to Phe22 in the gene-derived sequence, which is the
expected start of the FimH chain. To distinguish residues in the adhesin protein
from residues in the chaperone, FimH residues will be denoted with an "H" and
FimC residues with a "C" after the residue number. Residues involved in
chaperone binding are indicated by an open circle above the residue. Residues
in the carbohydrate binding pocket are boxed, with a large box marking the
NH2-terminal extensions in the pilin subunits. The conserved β-zipper motif
found in all pilin subunits corresponds to the F β-strand. Limits and
nomenclature for secondary structure elements are shown below the
sequence.

Figure 2 shows the sequence of FimC. Residues involved in subunit
binding are indicated by an open circle above the residue. Residues that are
identical or conserved in all periplasmic chaperones are set against a darker
background. Limits and nomenclature for secondary structure elements are
shown below the residues.


Figure 3 shows β-sheet topology diagrams of the mannose binding
domain (Fig. 3A) and the pilin domain (Fig. 3B) of FimH. The F strand is at the
C-terminal end of the pilin domain and thus would appear at the C-terminus of
the FimH molecule. Each strand depicted is readily correlated with the
corresponding sequences of Figure 1 wherein the strand designation appears
below the arrow indicating the residues making up that strand.

Figure 4 shows a MOLSCRIPT (P.J. Kraulis, J. Appl. Cryst. 24, 946
(1991)) ribbon diagram of the FimC-FimH complex, with FimH oriented
vertically and FimC oriented horizontally, as depicted by the arrows (with the
G1 strand indicated). The upper right shows a ball-and-stick representation of
the C-HEGA molecule bound to the lectin domain of FimH and indicates the
position of the carbohydrate-binding site at the tip of the domain.


Figure 5 shows a sequence alignment of P-pilus subunits (PapA, PapK,
PapE, and PapF). The secondary structural elements of PapK are indicated
above the aligned sequences, with β-strands and helices (including both α and
3₁₀) indicated. Residue numbers of PapK are indicated above the PapK
sequence. Residues involved in contact with domains 1 and 2 of PapD are
boxed. Residues strictly conserved among pilins are shown as the vertical
quartets of G, C and Y residues (using standard one letter code) under residue
numbers 7, 14, 54, 144 and 156.

Figure 6 shows the sequence and secondary structure definition of
PapD. Residue numbers are indicated above the sequence, while secondary
structural elements are indicated below it. Residues that contact PapK are
boxed.

Figure 7 shows the topology of PapK with the nomenclature for
secondary structures as described in Jones, Curr. Opin. Struct. Biol. 3, 846
(1993). The β-strands are indicated as arrows, while helices are shown as
cylinders. The inserting β-strand of PapD is indicated in the figure as G1.

Figure 8 shows the structure of the PapD-PapK complex and definition
of secondary structure notation. Here, the molecular surface of PapK was
calculated and displayed using GRASP. [Nicholls et al., Proteins: Struct. Funct.
Genet. 11, 281 (1991)] The structure of PapD is shown as a ribbon. The
insertion of the G1 β-strand of PapD into a groove on the surface of PapK
is shown.

Figure 9 shows the general arrangement for one embodiment of the
present invention wherein a FimH has been linked to residues 1-13 of FimG
using a tetrapeptide (DNKQ - using standard one letter amino acid code) linker
to form a donor strand complemented FimH (dscFimH). The construction of
this embodiment is described in detail in Example 1.

Figure 10 shows the results of an experiment in which C3H/HeJ mice
were immunized with pilus-protein plus CFA/IFA. Here, a 4 week boost was
provided IM (intramuscularly) and then the animals were challenged at 9 weeks
intraurethrally with a dose of 11.3x10^ cfu (colony forming units) of E. coli
strain NU14.

Figure 11 shows the results of an experiment in which C3H/HeJ mice
were immunized with pilus-proteins plus MF59 with 4 and 16 week boosts IM
(intramuscularly) with the indicated immunogens with endpoint titers shown at
the left.

Figure 12 shows the results of an experiment in which C3H/HeJ mice
were immunized with the indicated pilus-protein plus MF59. Here, 4 and 16
week boosts were provided IM (intramuscularly) and then the animals were
challenged at 19 weeks intraurethrally with a dose of 7x10' cfu (colony forming
units) of E. coli strain NU14.




DETAILED DESCRIPTION OF THE INVENTION 

The present invention is directed to polypeptides, especially
immunogenic polypeptides, formed from a pilus-protein, such as, for example,
an adhesin, and either a chaperone fragment or a pilin fragment, as well as
dimeric complexes thereof, wherein the components of said complexes may or
may not be covalently bound to each other. Such polypeptides, and complexes
thereof, have the advantage of being synthesized in pure form and in a fully
functional state without the need of accessory proteins, although the latter
may be used to form specific complexes within the invention.

As used herein, the term "pilus-protein" means any protein or
polypeptide present in a pilus, especially a type 1 pilus or a P pilus. This term
encompasses all subunits of a pilus but excludes chaperones because, while
they are integral in formation of a pilus, they are not incorporated into this
organelle and thus are not subunits of it. Also as used herein, the term
"adhesin" means a protein, especially a subunit of a pilus, with specific
receptor binding properties, for example, FimH that has a lectin-binding domain
for specifically binding mannose-bearing sites on cell surfaces, especially the
surfaces of mucosal cells, most especially bladder cells of a mammal.

During pilus biogenesis, the chaperone, either FimC or PapD, binds to
and forms stable complexes with individual pilus subunits. The chaperones
consist of two immunoglobulin-like (Ig-like) domains oriented toward each
other to form L-shaped molecules. [Holmgren and Branden, Nature 342, 248
(1989); Pellecchia et al., Nature Struct. Biol. 5, 885 (1998)] The FimH adhesin
has both a pilin domain and a receptor-binding domain. The PapK pilin and the
pilin domain of FimH have Ig-like folds but lack the seventh C-terminal β-strand
(strand G of Figure 3) present in canonical Ig-folds. The absence of this strand
produces a deep groove along the surface of the pilin domain and exposes the
hydrophobic core, thereby accounting for the instability of pilins when
expressed without the chaperone (see examples hereinbelow). Thus, in the
chaperone-subunit complex, the G1 strand of the chaperone completes an
atypical Ig fold of the subunit by occupying the groove and running parallel to
the subunit C-terminal F strand. In accordance with the present invention, this
"donor strand complementation" interaction simultaneously stabilizes pilus
subunits and caps their interactive surfaces, preventing their premature
oligomerization in the periplasm. Also in accordance with the present
invention, during pilus biogenesis, the highly conserved N-terminal extension of
one subunit displaces the chaperone G1 strand from a neighboring subunit by
a process referred to herein as "donor strand exchange." The N-terminal
strand then inserts anti-parallel to the F-strand of the neighboring subunit to
afford a mature pilus comprising an array of perfectly canonical Ig domains,
each of which contributes a strand to the fold of its neighboring subunit.



Thus, further in accordance with the invention disclosed herein, the
contribution of a chaperone, such as FimC or PapD, to the overall structure of
a pilin, such as in the FimC-FimH complex, or in the PapD-PapK complex, was
determined by solving the structure of such complexes by X-ray diffraction
(see: Choudhury et al., X-ray Structure of the FimC-FimH Chaperone-Adhesin
Complex from Uropathogenic E. coli, Science 285, 1061 (1999); Sauer et al.,
Structural Basis of Chaperone Function and Pilus Biogenesis, Science 285,
1058 (1999); Barnhart et al., PapD-like Chaperones Provide the Missing
Information for Folding of Pilin Proteins, Proc. Natl. Acad. Sci. USA,
10.1073/pnas.130183897 (published online June 20, 2000), the disclosures of
all of which references are hereby incorporated by reference in their entirety).
For the FimC-FimH complex, the crystal structure was solved using MAD data
to 2.7 Å collected from selenomethionyl FimC-FimH crystals. The crystals used
for the structure determination belong to the symmetry group C2 with cell
dimensions a = 139.08 Å, b = 139.08 Å, c = 214.49 Å, and β = 89.97°. The
crystal structure revealed the presence of a pocket in the otherwise flat
surface of the lectin domain. This pocket is large enough to accommodate a
single mannose unit and is located at the tip of the domain, distal to the
connection with the pilin domain. For FimH, the bottom of the pocket is
defined by the N-terminus of the FimH molecule and is lined with typical
carbohydrate binding side chains from Asn, Gln, and Asp residues in 3 loop
regions.

This analysis showed that FimH is folded into 2 domains of the
all-beta class. These domains are aligned end to end so that the FimH molecule
spans a length of over 100 Å. The amino terminal domain of the adhesin,
FimH, contains the mannose binding domain, used to bind to the surface of a
mucosal cell, and the C-terminal end forms the pilin domain, which is used to
anchor the adhesin to the pilus. The NH2-terminal mannose binding domain
comprises residues 1H-156H while the C-terminal pilin domain comprises
residues 160H-279H (see Figure 1). In addition, a short extended linker
(residues 157H-159H) connects the two domains.
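
By way of illustration only, the residue ranges recited above can be expressed
as a simple lookup. The following Python sketch is not part of the disclosed
methods; the names FIMH_DOMAINS, fimh_domain and preprotein_to_mature are
hypothetical, and the offset of 21 assumes, per the description of Figure 1,
that FimH numbering starts at position 22 of the pre-protein.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    name: str
    first: int  # first residue, mature FimH numbering (inclusive)
    last: int   # last residue, mature FimH numbering (inclusive)


# Ranges as recited in the text: 1H-156H, 157H-159H, 160H-279H.
FIMH_DOMAINS = (
    Domain("mannose binding domain", 1, 156),
    Domain("interdomain linker", 157, 159),
    Domain("pilin domain", 160, 279),
)

# Assumption: mature residue 1 corresponds to pre-protein position 22.
SIGNAL_OFFSET = 21


def fimh_domain(mature_residue: int) -> str:
    """Return the name of the FimH region containing a mature residue number."""
    for domain in FIMH_DOMAINS:
        if domain.first <= mature_residue <= domain.last:
            return domain.name
    raise ValueError(f"residue {mature_residue} is outside 1-279")


def preprotein_to_mature(position: int) -> int:
    """Convert a pre-protein position to mature FimH numbering."""
    if position <= SIGNAL_OFFSET:
        raise ValueError("position lies within the assumed signal sequence")
    return position - SIGNAL_OFFSET


if __name__ == "__main__":
    for residue in (1, 158, 279):
        print(f"{residue}H: {fimh_domain(residue)}")
```

Keeping the ranges in a single table means that a different pilus subunit
could, in principle, be handled by substituting its own boundaries.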



As already stated, FimC has 2 immunoglobulin-like domains oriented at
about 90° to each other and with a deep cleft between the two domains. The
pilin domain of FimH binds in the cleft of the chaperone but, with the seventh
strand of the fold missing (see Figure 3), the hydrophobic core is exposed and
the terminal portions of the pilin domain lie next to each other instead of at
opposite ends of the pilin domain.



Thus, the pilin domain of FimH has the same topology as the
NH2-terminal domain of a number of chaperones but with the critical difference
that the seventh strand of the fold is missing. A similar situation would occur in
other pilus-proteins, such as FimG and FimF. In the FimH-FimC complex, the
G1 strand (i.e., the seventh strand) of the chaperone is used to complement
the pilin domain by being inserted between the second half of the A strand and
the F strand of the pilin. Thus, the C-terminal, or F, strand of FimH (Figs. 1
and 3) forms a parallel beta pleated strand interaction with the G1 β-strand of
FimC (Fig. 2) and has its COOH-terminal carboxyl group anchored in the
crevice of the chaperone cleft through hydrogen bonds with the conserved
residues Arg8C and Lys112C in FimC (the numbers and letters refer to
the residue number and to the identity of the protein involved - for example,
"C" for FimC and "H" for FimH sequences). The residues involved in the
FimH-FimC interactions are indicated in Figures 1 and 2 where the amino acid
sequences of these proteins are shown.

The mechanism by which this process occurs, wherein the chaperone
binds to the adhesin, here FimH, to direct proper folding of the adhesin, is
referred to herein as "donor strand complementation" and the strand that
complements the fold or structure of the adhesin, such as the G1 β-strand of
the chaperone FimC, is referred to as the "donor complement" segment or the
"donor strand." More specifically, the G1 β-zipper strand of periplasmic
chaperones contains a conserved motif of solvent exposed hydrophobic
residues at positions 103, 105, and 107 in FimC (see Figure 2). In the
FimC-FimH complex, these residues are thereby used to complete the unfinished
hydrophobic core of FimH. In short, the pilin domain is an incomplete protein
domain to which the chaperone donates its G1 strand so as to complete it. As
a result, pilus-proteins, or pilins, as a class, are missing necessary steric
information needed for stability as well as for folding into the native 3
dimensional conformation and so require the presence of the chaperone, FimC
in the case of FimH, in order to provide the necessary complementary, and
compensating, structure.

The G1 strand of the chaperone simply inserts itself into the groove of
FimH (the F and A2 line groove) so that engineering the same sequence at the
C-terminal end of the pilin domain of FimH results in a FimH having the same
structure (i.e., same overall shape) as if the chaperone were present.
Alternatively, the N-terminal sequence of FimG may be employed since it is
better for forming a completely canonical fold. In the same way, a donor
strand can be engineered onto PapK by modeling the appropriate strand from
the chaperone PapD or a strand from another suitable pilus subunit structure.

Because this donor complemented structure of FimH, or any other
adhesin, is necessary to the overall shape of the molecule and thus to the
ability to stimulate the immune response, it is virtually impossible, in the
absence of the chaperone, to generate such adhesins in a pure form to be used
as immunogens, either by recombinant means or otherwise, and, thus, as a
vaccine for prevention and treatment of infections.

In accordance with the present invention, the sequences missing from
the structure of the putative pilus-protein immunogen are supplemented with
appropriate donor strand sequences to take the place of the missing chaperone
sequence. As disclosed herein, examples of such sequences and structures are
provided. Also provided are rules providing guidance in facilitating the
determination of appropriate sequences to use for donor complementing a
wide variety of bacterial pilus proteins, especially where these are to be used
in the development of compositions for preventing and/or treating bacterial
diseases. Further, it is advantageous to supply these missing structures by
recombinant means and thereby give rise to a fully active (i.e., immunogenic)
adhesin molecule or complex.

FimA, FimF, and FimG all have a highly conserved extension of about
10-20 amino acids at the N-terminus compared to the FimH pilin domain. In
the PapD-PapK structure, the PapK amino terminal extension is disordered and
thus the first β-strand begins after the first cysteine, just as in the FimH pilin
domain. In accordance with one embodiment of the present invention, an
amino terminal extension of one subunit is used as a donor strand to provide
the missing seventh strand in the neighboring subunit. Use of such neighboring
subunits produces a complete canonical fold whereas the chaperone itself
would complete an atypical fold.



Also in accordance with the present invention, crystallographic
procedures likewise show that PapK has the same overall variable-region Ig-like
fold as the amino terminal domain of PapD, with 2 β-sheets coming together in
a β-sandwich. However, the Ig fold of PapK is incomplete in that it lacks the
COOH-terminal seventh strand, G, which in canonical Ig folds forms an
antiparallel β-sheet interaction with strand F and contributes to the
hydrophobic core of the protein. It has now been found that in the PapD-PapK
complex, this missing strand is provided by PapD, which donates its G1
β-strand to complete the Ig fold of PapK, in analogous fashion to that found for
the FimH-FimC complex just described. However, the Ig fold produced thereby
is atypical, since the donated strand runs parallel, rather than antiparallel, to
strand F in PapK. The insertion of the G1 β-strand into the fold of the pilin, i.e.,
donor strand complementation, is similar to that observed in the crystal
structure of the FimH-FimC complex.

The first 8 amino acid residues of PapK are disordered and the Ig fold of
PapK begins with a short β-strand, A1 (see Figure 7), that makes typical
antiparallel hydrogen bonds with the COOH-terminal residues of strand B. This
short β-sheet arrangement is interrupted by the insertion of the 3₁₀ helical turn
that results in strand A switching sides in the β-sandwich in order to make
antiparallel β-strand interactions with the G1 strand of the chaperone. Strands
A and B are connected by a short α-helix and it is strand B that forms the edge
of one of the two β-sheets in the β-sandwich running antiparallel to strand E.
Following strand B, the structure crosses over to the other side of the
β-sandwich through a short 3₁₀ helix to form strand C1, which then runs
antiparallel to strand F.

The total buried surface area in the PapD-PapK complex is 3434 Å² and
there are two distinct sites on PapK that interact with two corresponding sites
on PapD. Site K1 of PapK (see Figure 7) interacts with domain 1 of PapD (site
D1 of Figure 6) and K1 contains a deep groove that runs the length of the
subunit. The edges of this groove consist of strands A and F and its base is
formed by the hydrophobic core of PapK. This groove is the result of the
missing G β-strand in the Ig fold of PapK. Residues 101 to 112 (in site D1) of
the G1 β-strand in the Ig fold of PapD (see Figure 6) insert into the K1 groove
and make a β-zipper interaction with strand F of PapK (Figure 7) on one side of
the groove. Residues 101 to 105 also make a β-zipper interaction with strand
A2 on the other side of the groove. Insertion of the G1 β-strand also results in
the formation of a continuous 5-stranded β-sheet which includes strands C1,
F1, and G1 of PapD and the F and C1 strands of PapK. Alternating
hydrophobic residues in the G1 β-strand of PapD interact with the hydrophobic
base of the groove. Thus, donor strand complementation by the G1 β-strand of
PapD shields the hydrophobic core of the pilin from exposure to the aqueous
milieu of the periplasm. In addition, site K2 of PapK interacts with a site on the
COOH-terminal domain (domain 2) of PapD. Thus, in the PapD-PapK complex
structure, strand F of PapK forms one side of the groove into which the G1
β-strand of the chaperone inserts and is likely to assume the same structural
role in pilins.

Genetic, biochemical and electron microscopic studies have shown that
residues in two conserved motifs (the C-terminal F strand and an N-terminal
motif) participate in subunit-subunit interactions necessary for pilus assembly.
[See: Hung et al., EMBO J. 15, 3792 (1996); Kuehn et al., Science 262, 1234
(1993); Hultgren et al., in Molecular Biology of Chaperones and Folding
Catalysts: Regulation, Cellular Functions and Mechanisms, B. Bukau, Ed.
(Harwood Academic Publishers, Amsterdam, 1999), p. 661]. An alignment of
the pilin sequences (see Figure 1), based on the FimC-FimH crystal structure,
revealed that the N-terminal motif was part of a 10-20 amino acid extension
that was missing in the FimH pilin domain and disordered in the PapD-PapK
complex (a similar alignment for Pap-proteins is shown in Figure 4). This region
contains a pattern of alternating hydrophobic residues similar to the G1 donor
strand of the chaperone. As demonstrated by molecular modeling, the
N-terminal extension of such a subunit takes the place of the G1 strand of the
chaperone by fitting into the pilin groove. Thus, during pilus assembly,
alternating hydrophobic side chains in the N-terminal extension can replace the
hydrophobic side chains donated to the pilin core by the G1 strand of the
chaperone, via the donor strand exchange mechanism of the present invention.
The net result is that every subunit completes the immunoglobulin-like fold of
its neighboring subunit.
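
By way of illustration only, the "pattern of alternating hydrophobic residues" referred to above can be located in a candidate donor strand by a simple scan of every other position. The following Python sketch is offered under stated assumptions: the hydrophobic residue set and the function name `alternating_hydrophobic_starts` are hypothetical choices not recited in this disclosure, and the indices returned are positions within the strand as written, not FimC residue numbers such as 103, 105, and 107.

```python
# Assumed hydrophobic set; the disclosure does not enumerate one.
HYDROPHOBIC = set("AVLIMFWY")


def alternating_hydrophobic_starts(seq, count=3):
    """Return 0-based indices i such that seq[i], seq[i+2], ... (count
    residues taken at every other position) are all hydrophobic."""
    span = 2 * (count - 1)
    return [i for i in range(len(seq) - span)
            if all(seq[i + 2 * k] in HYDROPHOBIC for k in range(count))]


# G1 strand of FimC in forward orientation, as listed in this disclosure.
fimc_g1 = "TLQLAIISRIKLYYR"
starts = alternating_hydrophobic_starts(fimc_g1)
```

A strand that yields no start index under a chosen residue set would, on this crude measure alone, be a weaker candidate for filling the hydrophobic groove; modeling against the recipient pilus protein remains the criterion given in the text.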

The NH2-terminal portion of pilins, corresponding to the disordered NH2-
terminus of PapK, forms an assembly surface on the pilin. The 8 NH2-terminal
residues are disordered in the PapD-PapK complex and protrude away from
the main body of the structure, where they are free to interact with the groove
of the preceding subunit located at the usher. In accordance with the present
invention, therefore, the NH2-terminus of an incoming subunit inserts into the
groove of the preceding subunit, displacing the G1 β-strand of the chaperone
(which process is facilitated by the usher). Such "donor strand exchange"
implies that in the pilus, the NH2-terminal strand of one subunit completes the
immunoglobulin-like fold and thereby protects the hydrophobic core of the
preceding subunit, much as the chaperone does in the periplasm.

The present invention is thus directed to pilus-proteins, including
adhesins, such as FimH, or non-adhesins such as FimG and FimA, in which the
missing fragment normally supplied by a chaperone, such as FimC, or a pilin
via the donor strand exchange mechanism during pilus assembly, has been
added to its amino acid sequence derived from that chaperone or pilin. In
accordance with the invention, this result is accomplished by engineering a G1
β-strand of FimC (SEQ ID NO: 3 and 4), or an N-terminal extension strand of
FimG (SEQ ID NO: 5 and 6) or FimF (SEQ ID NO: 7 and 8), or other similar
sequence of related pilins, and in either a forward or reverse (i.e., inverted)
sequence orientation onto the COOH-terminus of a pilus-protein, such as an
adhesin, thereby removing the requirement of a chaperone and/or donor strand
complementation. For example, where the pilus-protein is an adhesin, such as
FimH, the addition of the C-terminal β-strand to the adhesin completes the
immunoglobulin fold and thus removes the need for any type of accessory
protein, such as a chaperone. Thus, by donating a secondary structural
element to the fold of the pilin, for example, in the PapD-PapK complex, the
chaperone not only contributes to the stability of the pilin but also prevents
other pilins in the periplasm from binding to the groove of the chaperone-bound
subunit. It should be noted that for P pili, PapG is the adhesin.

Due to topological requirements and constraints, and depending on the
protein being engineered and the β-strand being "donated," the donor
sequence can be engineered in a forward or reverse (i.e., inverted) amino acid
sequence orientation with respect to the pilus-protein sequence being
"complemented." However, because this donor strand must fold around and
loop back onto the pilus strand in an anti-parallel orientation, such donor
complemented structures can be engineered with the donor strand in a forward
or reverse orientation. Whether engineered in a forward or reverse direction
depends on the source of the sequence used to prepare the strand, for
example, whether the donor strand is a match from the G1 strand of the
corresponding chaperone or is modeled on the N-terminal strand of a
neighboring subunit, such as where the N-terminal sequence of FimG is used
to complement FimH. If engineered in a forward orientation, the N-terminal
residue of the donor strand is bonded to the C-terminal residue of the pilus
protein (e.g., FimH), while in the reverse orientation, the C-terminal residue of
the donated strand is bonded to the C-terminal residue of the adhesin.
Depending on the type of structural arrangement achieved, it may be
necessary to also insert a linker sequence (perhaps as short as 4 residues or a
structure of similar length and conformation) between the pilus-protein
sequence and the donor strand sequence in order to afford sufficient flexibility
for appropriate conformational folding necessary to attain the proper 3
dimensional structure (and such was done in carrying out embodiments of the
present invention). Such engineering is readily accomplished either by direct
chemical synthesis of the desired adhesin or by using a polynucleotide
sequence in which the order of the codons encoding the donated strand
sequence has the desired orientation. Such donated strand sequence may, of
course, be derived from the G1 strand of FimC or PapD, or the N-terminal
extension strand of a pilus subunit, such as FimG. It should be mentioned that
because the chaperone is not incorporated into the assembled pilus, such
chaperones are not "pilus-proteins."
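
By way of illustration only, the forward and reverse (inverted) orientations described above can be sketched at the level of one-letter sequences. In the Python sketch below, the function names are hypothetical and the pilus-protein string is a placeholder; the only sequences taken from this disclosure are the forward and reverse forms of the G1 strand of FimC. Reading the fused chain from N- to C-terminus, the forward form places the N-terminal residue of the donor next to the C-terminal residue of the pilus protein, whereas the reverse form places the donor's original C-terminal residue there.

```python
def reverse_orientation(donor):
    """Return the donor strand with its residue order inverted."""
    return donor[::-1]


def append_donor(pilus_protein, donor, reverse=False):
    """Append a donor strand to the C-terminus of a pilus-protein sequence.

    With reverse=True the donor is inverted first, so the residue written
    immediately after the pilus protein is the donor's original C-terminus.
    """
    strand = reverse_orientation(donor) if reverse else donor
    return pilus_protein + strand


fimc_g1_forward = "TLQLAIISRIKLYYR"      # forward form listed in this disclosure
fimc_g1_reverse = reverse_orientation(fimc_g1_forward)

placeholder_pilus = "GGGG"               # hypothetical stand-in, not a real pilin
fused = append_donor(placeholder_pilus, fimc_g1_forward, reverse=True)
```

A linker, where one is needed, would be placed between the two parts; it is omitted here to keep the orientation logic in view.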

The ability to produce a fully formed and functional pilin protein without
the need to provide for the chaperone, either by addition or as a co-expressed
protein, is highly advantageous in that it permits large scale production of a
single polypeptide chain, thereby reducing the time, cost and complexity of
such production.

In furtherance of this objective, the present invention is directed to
modified pilus-polypeptides, i.e., polypeptides or proteins derived from pili,
especially type 1 pili and P pili, and forming subunits thereof, including
adhesins such as FimH, and pilins such as PapK, comprising a pilus-protein
portion and a donor complement portion all within the same amino acid
sequence. Thus, with respect to a polypeptide within the invention the
distinction between a donor complement portion and an adhesin or pilin portion
is merely for convenience and in fact only a single polypeptide is contemplated
herein, although such donor complement fragment may certainly be chemically
attached to the adhesin portion by some chemical linkage other than a
conventional peptide bond. Such modified proteins include FimH and PapK
following donor strand complementation. Thus, donor strand complemented
FimH would be denoted dscFimH (where dsc stands for "donor strand
complemented"). In an alternative embodiment of the present invention, the
donor strand need not be covalently attached at all but can merely be bonded
non-covalently as part of a complex the overall shape of which is the native
shape of the adhesin or pilin protein, such as FimH.

The problem of producing pure pilins or adhesins, free of a chaperone
complex, and available for use as a vaccine, is solved by the present invention
by providing any desired pilus protein (a term that does not include
chaperones) from any bacterial structure assembled by the chaperone-usher
pathway (see Hung and Hultgren, J. Struct. Biol. 124, 201-220 (1998) for a
review of this pathway) whose fold is complemented by a peptide, or even a
non-peptide moiety, to form one structure or complex that is readily prepared
in pure form and without the need for accessory proteins, such as chaperones.

In accordance with the disclosure herein, the present invention relates
to an immunogenic complex comprising a pilus protein component and a donor
strand component, wherein said pilus protein component may or may not be
covalently bound to the donor strand component. Thus, the pilus and donor
components may be held together by non-covalent interactions, including, but
not limited to, electrostatic interactions, hydrophobic interactions, including
van der Waals forces and general entropic forces, London forces, and the like.
Such pilus protein component may or may not be an adhesin. Where it is an
adhesin, especially preferred adhesins are FimH and PapG. Other pilus proteins
are those selected from the group consisting of FimF and FimA (both from type
1 pili) and PapK, PapA, PapF, PapE, and PapH (all from P pili). Donor
strand components comprise an amino acid, or other polymeric strand, that is
able to substitute for the missing strand of the pilus protein. Such substitution
is determined by resort to the rules disclosed herein for utilizing the disclosed
strands from PapD, FimC and FimG to replace missing strands of PapK and
FimH.

The present invention also relates to polypeptides, especially
immunogenic polypeptides, comprising a pilus protein portion and a donor
complement portion, both of which may be part of the same protein complex.
The donor complement portion may be a bacterial pilin or adhesin, a portion
thereof, or may be derived from a bacterial pilin or adhesin. The pilus-protein
portion of such a novel polypeptide will commonly be formed from the amino
acid sequence of a native pilus-protein molecule, such as that making up the
normal pilus from a bacterial cell (including all of the pilus proteins, including
adhesins, just mentioned for the immunogenic complexes of the invention).
Such bacterial cells may be any bacterial cells that have pilus structures,
preferably from the Enterobacteriaceae family, most especially Escherichia coli.

In forming such polypeptides, especially immunogenic polypeptides, the
missing strand of the pilus protein can be compensated for using any
appropriate donor strands, regardless of the source, so long as said donor
strands will provide a functional donor complement (wherein the term
"functional" means that the donor complementary strand, when attached to
the C-terminus of the adhesin sequence, will permit the adhesin to assume its
native three dimensional conformation without the need for the entire
chaperone, or other separate conformation-directing molecule, to be present).
In general, any sequence may be used if it can be modeled to fit into the
groove.



For example, where an adhesin sequence provides the pilus-protein
portion of the polypeptides of the present invention, said adhesin will most
preferably be the FimH protein found in bacterial cells such as E. coli. The
amino acid sequence of FimH might conceivably vary slightly from one
strain to another, and all such sequences are contemplated by the
polypeptides of the invention. Most preferably, the FimH sequence of SEQ ID
NO: 1 forms the adhesin portion, segment or fragment of the polypeptides of
the invention when the pilus-protein is an adhesin.

As used herein, the terms "portion," "segment," and "fragment," refer
to a continuous sequence of residues, such as amino acid residues, which
sequence forms a subset of a larger sequence. For example, if a polypeptide
were subjected to treatment with any of the common endopeptidases, such as
trypsin or chymotrypsin, the oligopeptides resulting from such treatment would
represent portions, segments or fragments of the starting polypeptide.



Where the donor complement portion is derived from a chaperone,
especially a periplasmic bacterial chaperone, said chaperone is preferably FimC
or PapD, and most preferably where the chaperone or pilin is derived from a
bacterial cell of the family Enterobacteriaceae, especially E. coli, and most
preferably where the donor complement is the G1 β-strand of FimC or PapD
(see Figure 2 and Figure 6, respectively, for location of G1 sequence) or an
amino terminal extension sequence from a pilin such as FimG, especially where
said N-terminal sequence comprises at least.



The donor complement, or donor strand, portion of the polypeptides of
the invention, or immunogenic complexes of the invention, may contain any
amino acid, or other polymeric, sequence that provides appropriate
compensation for the missing strand of the pilus protein. The appropriate strand
to use as the donor complement is readily determined using the teaching of the
present disclosure. Thus, to find a useful donor strand to complement a
selected pilus protein, it is necessary to model the selected pilus protein with
the appropriate chaperone and thereby determine the location and extent of
the complementary sequence to use in forming the appropriate donor strand.
Of course, the amino acid sequence of the relevant strands must be known.

Amino acid sequences for the pilus proteins (both pilins and adhesins)
and chaperones recited herein are provided in Figures 1, 2, 5, and 6. The
corresponding SEQ ID NOs are tabulated in Table 1.

Table 1.

Protein    Protein Type    Figure No.    SEQ ID NO.
FimH       Adhesin         1             1
FimC       Chaperone       2             2
FimG       Pilin           1             9
FimA       Pilin           1             12
FimF       Pilin           1             13
PapD       Chaperone       6             14
PapK       Pilin           5             15
PapA       Pilin           5             16
PapE       Pilin           5             17
PapF       Pilin           5             18
PapG       Adhesin                       19



It should be kept in mind that the sequences provided for these proteins
in the Sequence Listing can vary from species to species and from strain to
strain within a species so that the sequences provided herein represent only
one particular polypeptide. For example, SEQ ID NO: 19, for PapG, is for
PapGII, which specifically binds sugars on human epithelial cells. The latter
sequence also shows the signal sequence for secretion (which comprises the
first 20 residues at the N-terminal end).

It should also be clearly stated that the present invention is equally
operable with any proteins, and not just similar proteins, and regardless of the
organism, so long as a desired conformation of the protein, preferably but not
necessarily the native one, is achieved by donor strand complementation as
disclosed herein. Thus, the mechanism of the present invention is directed to
proteins that require a strand donated by another protein to achieve a given
conformation, such as the native conformation, and is not necessarily
restricted to use with the bacterial proteins used herein to describe the present
invention. Thus, the mechanism described herein may be a general one.

In selected embodiments of the present invention, an appropriately
selected donor strand is engineered onto the pilus protein to be complemented,
either by synthesizing said pilus protein with the donor strand as part of the
amino acid sequence, or by expressing a gene encoding the complemented
structure (the so-called "dsc-pilus protein"), or by forming some other
attachment, with the only requirement being that such dsc-pilus protein have
the native three dimensional shape characteristic of the pilus protein as it
occurs within the pilus, a structure stabilized in vivo by the presence of the
chaperone. Such is readily determined by modeling according to the procedure
described in the examples provided herein and in the description of the relevant
protein-chaperone complexes disclosed hereinabove.

In one embodiment of the present invention, a donor strand amino acid
sequence is engineered at the C-terminal end of FimH, or other pilus-protein,
using the 15-mer (or 15-residue) G1 fragment of FimC, with this G1 segment
attached to the C-terminus of FimH using a linker sequence of from 1 to 20
amino acids in length, preferably about 3 to 10 residues in length, with about
4 residues being an especially preferred embodiment. In another specific
embodiment the amino terminal extension of a pilin, especially FimG, is used
as the donor strand. When such an N-terminal sequence of a pilin is used to
form the donor strand, such a sequence normally comprises a sequence of
about 6 or more amino acids, preferably about 6 to 20 amino acids, most
preferably about 8 to 18 amino acids, especially 8 to 17 amino acids, where
said sequence is identical to, or derived from, any of the N-terminal extensions
of pilus subunits assembled by PapD-like chaperones. Such sequences could
have amino acid substitutions at one or more positions to enhance stability,
solubility, or other properties. Thus, the present invention encompasses any
such sequences, especially those with two or more alternating hydrophobic
residues, that complete the fold of and thus stabilize the pilus subunit. In
short, any sequence that accepts the donor strand complement can be
complemented by the strands of the invention herein and any strand that
completes the groove in the recipient protein to achieve the natural
conformation can be used. The usefulness of such sequences will be most
advantageous where one or more of the amino acids of the native G1
sequence have been replaced by amino acids of the same character
(hydrophobicity, acidity, basicity, etc.). Consequently, the donor strands useful
within the present invention do not have to be derived from a natural source
but may be wholly synthetic in nature and character so long as they
complement the recipient protein to produce the desired overall conformation,
regardless of the use to which the donor strand complemented protein is to be
put (so that it need not be for use as a vaccine but may have some other
utility).
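
By way of illustration only, replacement of residues of the native G1 sequence "by amino acids of the same character" can be screened with a simple class lookup. The Python sketch below rests on assumptions: the assignment of residues to classes and the name `is_same_character_variant` are hypothetical, since the text names hydrophobicity, acidity, and basicity but does not assign individual residues to classes.

```python
# Assumed residue classes covering the 20 standard amino acids.
_CLASSES = {
    "hydrophobic": "AVLIMFWY",
    "acidic": "DE",
    "basic": "KRH",
    "polar": "STNQ",
    "special": "CGP",
}
RESIDUE_CLASS = {res: name for name, group in _CLASSES.items() for res in group}


def is_same_character_variant(native, variant):
    """True if variant has the native length and every position keeps the
    same (assumed) residue class as the native sequence."""
    if len(native) != len(variant):
        return False
    return all(RESIDUE_CLASS[a] == RESIDUE_CLASS[b]
               for a, b in zip(native, variant))


native_g1 = "TLQLAIISRIKLYYR"            # FimC G1 strand as listed herein
conservative = "TLQLAIVSRIKLYYR"         # hypothetical variant, Ile to Val
non_conservative = "TLQLAIDSRIKLYYR"     # hypothetical variant, Ile to Asp
```

Passing this screen says nothing about whether a variant actually completes the fold; that remains a matter for modeling and expression as described herein.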

In an especially preferred embodiment, the first 13 amino acids of FimG
(see Figure 1 and SEQ ID NO: 9) are linked to the C-terminal end of FimH (SEQ
ID NO: 1) with the two sequences separated by a tetrapeptide linker composed
of the sequence DNKQ (Asp-Asn-Lys-Gln) with the three segments

pilus protein - linker - donor strand

linked together by conventional peptide bonds. The production of such a
sequence (using FimH as pilus protein to form dscFimH) is provided in Example
1 below.
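
By way of illustration only, the arrangement "pilus protein - linker - donor strand" can be sketched as a simple concatenation of one-letter sequences. In the Python sketch below, the name `build_dsc_sequence` and the pilus-protein placeholder are hypothetical; the DNKQ linker is taken from the text, the 13-residue donor is the FimG N-terminal extension as listed among the donor strand sequences of this disclosure, and the upper limit of 20 linker residues follows the range recited for one embodiment.

```python
def build_dsc_sequence(pilus_protein, donor, linker="DNKQ"):
    """Join pilus protein, linker and donor strand, N- to C-terminus."""
    if not pilus_protein or not donor:
        raise ValueError("pilus protein and donor strand must be non-empty")
    if len(linker) > 20:
        raise ValueError("linker longer than the 20 residues recited")
    return pilus_protein + linker + donor


fimg_extension = "DVTITVNGKVVAK"   # 13 residues, as listed in this disclosure
placeholder_fimh = "GGGG"          # hypothetical stand-in, NOT SEQ ID NO: 1

dsc = build_dsc_sequence(placeholder_fimh, fimg_extension)
```

With the real FimH sequence substituted for the placeholder, the same concatenation gives the order of residues that a coding sequence for the single-chain construct would need to specify.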

In addition, the novel polypeptides of the invention also include
polypeptides having sequence homology, possibly less than 20% sequence
identity, with the sequences of pilus proteins such as FimH, FimA, FimG, FimF,
PapG, PapK, PapA, PapE, and PapF with the appropriate donor strand attached
thereto to form the corresponding dsc-proteins with an optional linker
sequence (for example, the N-terminal extension of the appropriate subunit
fused to the corresponding protein to be complemented). For example, PapE
and PapK have less than 20% sequence identity. The methods of the present
invention are successful with any proteins assembled by the chaperone-usher
pathway (i.e., assembly of a type 1 pilus). In addition, the replacement of
selected amino acids as a means of increasing antigenic ability, or to increase
or decrease the extent of other properties, is considered well within the skill of
those in the art. Because the presence of FimC is not required to prepare the
polypeptides of the invention, selected segments of the FimH portion that may
be required for interaction and/or binding with FimC and serve no other
purpose may advantageously be either eliminated or replaced with other amino
acids, even those of different character. It is therefore deemed well within the
skill of those in the art to make appropriate substitutions that may increase
solubility, or any other desirable property, without sacrificing antigenicity.

It should be noted that, in accordance with the present invention, the
term "percent identity" or "percent identical," when referring to a sequence,
means that a sequence is compared to a claimed or described sequence after
alignment of the sequence to be compared (the "Compared Sequence") with
the described or claimed sequence (the "Reference Sequence"). The Percent
Identity is then determined according to the following formula:

Percent Identity = 100 [1 - (C/R)]

wherein C is the number of differences between the Reference Sequence and
the Compared Sequence over the length of alignment between the Reference
Sequence and the Compared Sequence wherein (i) each base or amino acid in
the Reference Sequence that does not have a corresponding aligned base or
amino acid in the Compared Sequence and (ii) each gap in the Reference
Sequence and (iii) each aligned base or amino acid in the Reference Sequence
that is different from an aligned base or amino acid in the Compared Sequence,
constitutes a difference; and R is the number of bases or amino acids in the
Reference Sequence over the length of the alignment with the Compared
Sequence with any gap created in the Reference Sequence also being counted
as a base or amino acid.

If an alignment exists between the Compared Sequence and the
Reference Sequence for which the percent identity as calculated above is
about equal to or greater than a specified minimum Percent Identity then the
Compared Sequence has the specified minimum percent identity to the
Reference Sequence even though alignments may exist in which the
hereinabove calculated Percent Identity is less than the specified Percent
Identity.
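
By way of illustration only, the Percent Identity formula above can be evaluated for two sequences that have already been aligned. The Python sketch below is offered under stated assumptions: the function name `percent_identity`, the use of "-" as the gap character, and the skipping of columns in which both sequences carry a gap are hypothetical choices not recited in the text, and producing the alignment itself is outside the sketch.

```python
def percent_identity(reference, compared, gap="-"):
    """Evaluate 100 * (1 - C/R) for two pre-aligned, equal-length strings."""
    if len(reference) != len(compared):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: a column that is a gap in both sequences carries no
    # information and is dropped.
    columns = [(r, c) for r, c in zip(reference, compared)
               if not (r == gap and c == gap)]
    if not columns:
        raise ValueError("empty alignment")
    # R: Reference positions over the alignment, with each gap created in
    # the Reference also counted as a position.
    r_count = len(columns)
    # C: (i) Reference residue opposite a gap in the Compared Sequence,
    # (ii) gap in the Reference, (iii) mismatched aligned residues.
    c_count = sum(1 for r, c in columns if r != c)
    return 100.0 * (1 - c_count / r_count)


identical = percent_identity("ACDEFG", "ACDEFG")
one_mismatch = percent_identity("ACDEFG", "ACDQFG")
one_reference_gap = percent_identity("AC-EFG", "ACDEFG")
```

Because the definition is satisfied if any alignment reaches the specified minimum, the function would in practice be applied to the best alignment available rather than to an arbitrary one.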

The present invention is not limited to singly engineered proteins but
includes also engineered protein complexes. A non-limiting example of such a
complex would be a complex formed between FimH and donor complemented
FimG. In such a complex, FimH (i.e., a native FimH without the donor strand
added and thus having an exposed tail groove) would be complexed with a
FimG. The amino terminal extension of the latter pilus-protein would then
insert into the exposed tail groove of FimH but would be prepared by the
methods disclosed herein having a G1 β-strand of FimC, or other suitable
donor complement strand, such as the N-terminal extension of FimF, attached
to the FimG C-terminus (with the donor strand in its forward or reverse
sequence orientation). These proteins would thereby be combined to form a
self-assembling FimG-H complex, such as a dimer, especially a heterodimer,
useful in an immunogenic composition, such as a vaccine. Donor
complemented FimA (dscFimA) may combine with FimH in the same way that
FimG does and forms a similar complex of an adhesin and a donor-
complemented pilus-protein.

An especially preferred embodiment of the present invention comprises
a polypeptide having an amino acid sequence comprising the amino acid
sequence of FimH (or some variant thereof) as the pilus-protein portion and the
sequence of the donor complement, such as the G1 β-strand of FimC or the
amino terminal extension of FimG, especially where the G1 sequence of FimC
(but not the amino terminal extension sequence of FimG) is inverted and
attached through an amino acid linker to the C-terminus of FimH. Such linker
may be composed of different amino acids, especially sequences capable of
readily forming a loop structure so as to cause the donor strand to loop back
toward the pilus protein and form an anti-parallel structure in place of the
missing strand. Such a preferred embodiment is presented as SEQ ID NO: 10
wherein the linker is alternating serine/glycines and the donor strand is from
FimC (SEQ ID NO: 3). A similar preferred embodiment with FimH as the pilus-
protein and the N-terminal extension of FimG separated by a DNKQ linker
sequence is shown in SEQ ID NO: 11 (prepared in Example 1 below). An
embodiment with PapG (SEQ ID NO: 19), DNKQ linker, and donor strand from
PapF (SEQ ID NO: 20) is shown as SEQ ID NO: 21.

Donor strand sequences according to the present invention include the
following:

G1 β-strand of FimC:

Forward: NH2-TLQLAIISRIKLYYR-COOH    SEQ ID NO: 4
Reverse: NH2-RYYLKIRSIIALQLT-COOH    SEQ ID NO: 3

N-Terminal sequence of PapF:

NH2-DVQINIRGNVYIP-COOH    SEQ ID NO: 20

G1 β-strand of PapD:

Forward: NH2-DVTITVNGKVVAKP-COOH    SEQ ID NO: 6
Reverse: NH2-PKAVVKGNVTITVD-COOH    SEQ ID NO: 5
N-terminal extension of FimG:

NH2-DVTITVNGKVVAK-COOH    SEQ ID NO: 7

N-terminal extension of FimF:

NH2-DSTITIRGYVRDN-COOH    SEQ ID NO: 8
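The relationship between the forward and reverse orientations listed above can be checked with a short sketch. It assumes only that "reverse" means the same residues read in the opposite order; the function and constant names are illustrative and not part of the disclosure.

```python
# Forward G1 strand sequences as listed above (one-letter codes, N- to C-terminus).
FIMC_G1_FORWARD = "TLQLAIISRIKLYYR"
PAPD_G1_FORWARD = "DVTITVNGKVVAKP"


def reverse_orientation(seq: str) -> str:
    """Return the same residues in the opposite (reverse) order."""
    return seq[::-1]


fimc_reverse = reverse_orientation(FIMC_G1_FORWARD)
papd_reverse = reverse_orientation(PAPD_G1_FORWARD)
print(fimc_reverse)  # RYYLKIRSIIALQLT
print(papd_reverse)  # PKAVVKGNVTITVD
```

Reversing the forward FimC and PapD strands in this way reproduces the reverse sequences given in the list.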

In keeping with the spirit and flexibility of the disclosure herein, any
sequence that stabilizes the pilin or adhesin can be utilized advantageously as
the donor strand or donor complement. As shown in Example 2, below, the
donor strand complemented FimH (dscFimH) produced according to this
particular embodiment was expressed in the absence of the chaperone in the
periplasm and exhibited the properties of native FimH.

Regardless of the particular donor strand used as the complement in
forming the dsc-pilus-protein structure, several structural requirements must be
clearly borne in mind and may be considered "rules" for the selection of a
suitable donor strand for donor complementation of a selected pilus protein
according to the invention herein. For one thing, the donor strand binds to the
pilus-protein structure in an anti-parallel arrangement, so that it will commonly
be preferred to attach the donor strand from a chaperone at the C-terminus,
for example, of an adhesin, in a reverse orientation so that when it bends back
it permits an anti-parallel orientation, or in a forward direction in the case of an
N-terminal sequence of a pilus subunit as donor strand, again to facilitate the
desired anti-parallel orientation. In addition, because any such donor strand,
regardless of source, should be permitted to bend back in order to form the
appropriate strand complementation activity, the donor strand and the
C-terminus of the pilus protein portion of a polypeptide of the present invention
will usually be separated by an appropriate loop, or other suitable amino acid
sequence. Such a looped linking structure could comprise a sequence of up to
20 amino acids, especially 1 to 10 residues, for example, about 4 or 5
residues, for example, the sequence of residues that occurs in the F1-G1 loop
of the FimC chaperone, or especially useful would be the B1-C1 loop of PapD.
Because the amino acid chains of the novel polypeptides of the invention are
self-assembling, the donor strand will find its way to the appropriate
complementation strand with no further guidance required, so long as
sufficient conformational flexibility is provided to the engineered adhesin as a
whole.
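The selection "rules" above can be summarized in a minimal sketch. It is an illustration under stated assumptions, not a design tool: the function name, the `donor_source` labels, the placeholder pilus-protein string, and the serine/glycine linker length are all hypothetical, and the 1 to 20 residue check simply encodes the linker range mentioned in the text.

```python
def build_dsc_polypeptide(pilus_protein: str, linker: str,
                          donor_forward: str, donor_source: str) -> str:
    """Join pilus protein + linker + donor strand, written N- to C-terminus.

    donor_source == "chaperone": the donor strand is attached in reverse
        orientation (e.g. the G1 strand of FimC).
    donor_source == "subunit": the N-terminal extension of a pilus subunit
        is attached in forward orientation (e.g. that of FimG).
    """
    if not 1 <= len(linker) <= 20:
        raise ValueError("linker is assumed here to be 1 to 20 residues long")
    if donor_source == "chaperone":
        donor = donor_forward[::-1]  # reverse orientation
    elif donor_source == "subunit":
        donor = donor_forward        # forward orientation
    else:
        raise ValueError("donor_source must be 'chaperone' or 'subunit'")
    return pilus_protein + linker + donor


# Placeholder standing in for a real pilus-protein sequence (hypothetical).
PILUS_PLACEHOLDER = "X" * 10

# FimG N-terminal extension with the DNKQ linker, forward orientation.
with_fimg = build_dsc_polypeptide(PILUS_PLACEHOLDER, "DNKQ",
                                  "DVTITVNGKVVAK", "subunit")
# FimC G1 strand with an illustrative serine/glycine linker, reverse orientation.
with_fimc = build_dsc_polypeptide(PILUS_PLACEHOLDER, "SGSG",
                                  "TLQLAIISRIKLYYR", "chaperone")
print(with_fimg)
print(with_fimc)
```

The sketch only concatenates strings; whether a given construct folds correctly remains an experimental question.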

It is contemplated that the polypeptides of the present invention may be
in isolated or purified form.

"Isolated" in the context of the present invention with respect to
polypeptides (or polynucleotides) means that the material is removed from its
original environment (e.g., the cells used to recombinantly produce the
polypeptides disclosed herein). Such peptides could be part of a composition,
and still be isolated in that such vector or composition is not part of its natural
environment. The polypeptides and polynucleotides of the present invention
are preferably provided in an isolated form, and preferably are purified to
homogeneity.

The recombinant and/or immunogenic polypeptides, disclosed in
accordance with the present invention, may also be in "purified" form. The
term "purified" does not require absolute purity; rather, it is intended as a
relative definition, and can include preparations that are highly purified or
preparations that are only partially purified, as those terms are understood by
those of skill in the relevant art. For example, polypeptides from individual
clones isolated from a cDNA library have been conventionally purified to
electrophoretic homogeneity. Purification of starting material or natural
material to at least one order of magnitude, preferably two or three orders,
and more preferably four or five orders of magnitude is expressly
contemplated. Furthermore, claimed polypeptides having a purity of
preferably 0.001%, or at least 0.01% or 0.1%, and even desirably 1% by
weight or greater are expressly contemplated.
For purposes of recombinantly producing the polypeptides of the
invention, the term "expression product" means that polypeptide or protein
that is the natural translation product of the gene and any nucleic acid
sequence coding equivalents resulting from genetic code degeneracy and
thus coding for the same amino acid(s).
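Genetic code degeneracy can be illustrated with a small sketch. The two DNA sequences below are hypothetical examples chosen for illustration (they do not appear in the disclosure); both encode the Asp-Asn-Lys-Gln (DNKQ) linker using different synonymous codons from the standard genetic code.

```python
# Subset of the standard genetic code covering the four residues of interest.
CODON_TABLE = {
    "GAT": "D", "GAC": "D",  # aspartic acid
    "AAT": "N", "AAC": "N",  # asparagine
    "AAA": "K", "AAG": "K",  # lysine
    "CAA": "Q", "CAG": "Q",  # glutamine
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon using CODON_TABLE."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


seq_a = "GATAATAAACAA"  # hypothetical coding sequence
seq_b = "GACAACAAGCAG"  # a different sequence using synonymous codons

print(translate(seq_a))  # DNKQ
print(translate(seq_b))  # DNKQ
```

Both sequences are coding equivalents in the sense used above: they differ at the nucleotide level but specify the same amino acids.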

Thus, the polypeptides of the present invention may also be present in
the form of a composition. Such composition, where used for pharmaceutical
purposes, will commonly have the polypeptide of the present invention
suspended in a pharmacologically acceptable diluent or excipient.

The present invention is also directed to polynucleotides capable of
coding for the polypeptides of the invention, especially polynucleotides
encoding the amino acid sequence of SEQ ID NO: 11. Such polynucleotides
would therefore contain at least one coding region for the polypeptides of the
present invention, which would thus be an expression product thereof.

As used herein, the term "coding region" refers to that portion of a
gene which either naturally or normally codes for the expression product of
that gene in its natural genomic environment, i.e., the region coding in vivo
for the native expression product of the gene. The coding region can be from
a normal, mutated or altered gene, or can even be from a DNA sequence, or
gene, wholly synthesized in the laboratory using methods well known to
those of skill in the art of DNA synthesis.

In accordance with the present invention, the term "nucleotide
sequence" refers to a heteropolymer of deoxyribonucleotides. Generally,
DNA segments encoding the proteins provided by this invention are
assembled from cDNA fragments and short oligonucleotide linkers, or from a
series of oligonucleotides, to provide a synthetic gene which is capable of
being expressed in a recombinant transcriptional unit comprising regulatory
elements derived from a microbial or viral operon.

The term "expression product" means that polypeptide or protein that
is the natural translation product of the gene and any nucleic acid sequence
coding equivalents resulting from genetic code degeneracy and thus coding
for the same amino acid(s).

The term "primer" means a short nucleic acid sequence that is paired
with one strand of DNA and provides a free 3'-OH end at which a DNA
polymerase starts synthesis of a deoxyribonucleotide chain.

The term "promoter" means a region of DNA involved in binding of
RNA polymerase to initiate transcription.

As used herein, reference to a DNA sequence includes both single
stranded and double stranded DNA. Thus, the specific sequence, unless the
context indicates otherwise, refers to the single strand DNA of such
sequence, the duplex of such sequence with its complement (double
stranded DNA) and the complement of such sequence.
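The three forms covered by this definition (single strand, complement, and duplex) can be sketched as follows. The six-nucleotide input is a hypothetical example, and the complement is written 5' to 3' (i.e., as the reverse complement), which is an assumed convention for this illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def dna_forms(seq: str) -> dict:
    """Return the single strand, its complement, and the duplex of the two."""
    strand = seq.upper()
    complement = reverse_complement(strand)
    return {
        "single_strand": strand,
        "complement": complement,
        "duplex": (strand, complement),
    }


forms = dna_forms("ATGGCC")  # hypothetical sequence
print(forms["single_strand"])  # ATGGCC
print(forms["complement"])     # GGCCAT
```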

The present invention is also directed to antibodies specific for, and
antisera generated in response to, polypeptides of the invention. Such
antibodies may be either polyclonal or monoclonal and may be generated,
where monoclonal, from a cell, especially a hybridoma cell, by standard
methods in the art. In addition, the present invention also relates to cells, and
cell lines, genetically engineered to produce such antibodies after being
transfected, or otherwise transformed, so that their genomes contain, within
the main chromosome or as part of a plasmid or other vector, a
polynucleotide encoding the genes for an antibody specific for a polypeptide
of the invention, especially where said engineered cell is a cell capable of
forming and secreting a fully formed antibody, such technology being known
in the art.

The present invention also relates to vectors, such as plasmids,
comprising the polynucleotides of the invention, said polynucleotides
encoding polypeptides disclosed herein, and wherein such vectors are useful
for transforming cells and permitting said transformed cells to express the
polypeptides of the invention.
The present invention also relates to cells transformed by such
vectors and thereby expressing, with or without subsequent secretion
thereof, the polypeptides of the invention.

The present invention is also directed to vaccines and vaccine
compositions comprising the polypeptides disclosed herein. Such a vaccine
would comprise a composition containing an immunogenically effective
amount of a polypeptide of the invention. A preferred embodiment of the
invention is a vaccine composition comprising the polypeptides whose
sequences are shown as SEQ ID NO: 10 and 11.

It is an object of the present invention to utilize an immunogenic
composition for a vaccine (or to produce antibodies for use as a diagnostic or
as a passive vaccine) comprising a bacterial polypeptide of the invention. In
one embodiment, proteins and fragments (naturally or recombinantly produced,
as well as functional analogs) from bacteria that produce type 1 or type P pili
are contemplated. Even more particularly, E. coli is contemplated as the
source.

In another aspect of the invention, such an immunogenic composition
may be utilized to produce antibodies to diagnose urinary tract infections, or to
produce vaccines for prophylaxis and/or treatment of such infections as well as
booster vaccines to maintain a high titer of antibodies against the
immunogen(s) of the immunogenic composition.

While other antigens have been utilized to produce antibodies for
diagnosis and for the prophylaxis and/or treatment of bacterial urinary tract
infections, there is a need for improved or more efficient vaccines. Such
vaccines should have an improved or enhanced effect in preventing bacterial
infections mediated by pilus proteins.

There is a need for improved antigenic compositions comprising
adhesins and pilins for stimulating high-titer specific antisera to provide
protection against infection by pathogenic bacteria and also for use as
diagnostic reagents.

In one aspect, the present invention is directed to an immunogenic
composition comprising a purified dsc-pilus protein polypeptide or
immunogenic complex thereof. A specific embodiment comprises a native
adhesin, preferably FimH, and a donor complement, such as one derived from
a periplasmic chaperone, preferably FimC, most preferably the G1 strand of
FimC, or an amino terminal extension of a pilin, preferably FimG, most
preferably no more than the first 17 N-terminal residues of FimG, especially the
first 13 residues thereof, with the dsc-pilus-protein maintained in the complex
in an immunogenic form capable of inducing an immune response when
appropriately introduced into a human or other mammalian species. Thus, a
preferred embodiment is one that includes the N-terminal extension of another
subunit.

The dsc-polypeptides and complexes of the present invention are
primarily intended for use as vaccines. Generally, vaccines are prepared as
injectables, in the form of aqueous solutions or suspensions. Vaccines in an oil
base, such as for inhaling, are also well known. Solid forms which are dissolved
or suspended prior to use may also be formulated. Pharmaceutical carriers are
generally added that are compatible with the active ingredients and acceptable
for pharmaceutical use. Examples of such carriers include, but are not limited
to, water, saline solutions, dextrose, or glycerol. Combinations of carriers may
also be used.

Vaccine compositions may further incorporate additional substances to 
stabilize pH, or to function as adjuvants, wetting agents, or emulsifying 
agents, which can serve to improve the effectiveness of the vaccine. 


Vaccines are generally formulated for parenteral administration and are 
injected either subcutaneously or intramuscularly. Such vaccines can also be 
formulated as suppositories or for oral administration, using methods known in 
the art. 
The amount of vaccine sufficient to confer immunity to pathogenic
bacteria is determined by methods well known to those skilled in the art. This
quantity will be determined based upon the characteristics of the vaccine
recipient and the level of immunity required. Typically, the amount of vaccine
to be administered will be determined based upon the judgment of a skilled
physician. Where vaccines are administered by subcutaneous or intramuscular
injection, a range of 50 to 500 µg purified protein may be given.

In addition to use as vaccines, the polypeptides of the present
invention, and immunogenic fragments thereof, can be used as immunogens to
stimulate the production of antibodies for use in passive immunotherapy, for
use as diagnostic reagents, and for use as reagents in other processes such as
affinity chromatography.

The present invention also provides for a recombinant production or
synthesis of the proteins and polypeptides of the invention without the need
for any chaperone being present or the need to co-express any chaperone
during production of the replacement adhesin for use as a vaccine (or as an
immunogen to produce antibodies for diagnostic or therapeutic purposes).

Recombinant polypeptides of the invention are readily produced by
methods of genetic engineering already well known in the art (e.g., Example 1,
below) or by direct synthesis by well known, even automated, methods.
Therefore, a compendium of procedures for preparing the polypeptides and
complexes of the invention need not be recited herein.

In addition to producing a genetically engineered or synthetic sequence
for the complete adhesin-donor complement polypeptide, it is also possible to
attach the appropriate donor strand fragment at or near the COOH-end of the
adhesin chain by some chemical linker other than a conventional oligopeptide
using a standard peptide bond. Such chemically fused structures are
contemplated by the present invention, the nature of such structures being
limited only by the imagination of chemists seeking to produce functional
polypeptides of the invention. Such linking structures also include standard
polymers forming an appropriate looping structure or may be by any of the
non-covalent interactions listed above.

Thus, in accordance with the present invention, the ability to engineer a
recombinant adhesin-donor complement protein, a dsc-pilus polypeptide,
without the requirement of co-producing a chaperone, or adding an exogenous
chaperone, or fragments thereof, to the expression medium, to enable the
completion of the adhesin or pilin native structure and conformation readily
permits large scale synthesis of immunogenic polypeptides for use as vaccines.

The polynucleotides encoding the polypeptides of the invention may
also have the coding sequence fused in frame to a marker sequence which
allows for purification of the polypeptides of the present invention. The marker
sequence may be, for example, a hexa-histidine tag supplied by a pQE-9 vector
to provide for purification of the mature polypeptides fused to the marker in
the case of a bacterial host, or, for example, the marker sequence may be a
hemagglutinin (HA) tag when a mammalian host, e.g. COS-7 cells, is used.
The HA tag corresponds to an epitope derived from the influenza
hemagglutinin protein (Wilson, I., et al., Cell, 37:767 (1984)).
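What "fused in frame" means can be shown with a minimal sketch. It appends six histidine codons to the 3' end of a coding sequence while preserving the reading frame. The C-terminal placement, the choice of the CAT codon, the TAA stop codon, and the input sequence are all assumptions made for illustration; they do not describe how the pQE-9 vector itself supplies its tag.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
HIS6_CODONS = "CAT" * 6  # CAT encodes histidine in the standard genetic code


def add_his_tag_in_frame(coding_dna: str) -> str:
    """Append a hexa-histidine tag in frame, followed by a stop codon."""
    coding_dna = coding_dna.upper()
    if len(coding_dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [coding_dna[i:i + 3] for i in range(0, len(coding_dna), 3)]
    # Drop an existing terminal stop codon so translation continues into the tag.
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("internal stop codon would truncate the fusion")
    return "".join(codons) + HIS6_CODONS + "TAA"


# Hypothetical coding sequence: Met-Asp-Asn-Lys-Gln followed by a stop codon.
tagged = add_his_tag_in_frame("ATGGATAATAAACAATAA")
print(tagged)
```

Because every element is a whole number of codons, the tag is read in the same frame as the upstream coding sequence.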

When host cells are genetically engineered (transduced or transformed
or transfected) with the vectors comprising a polynucleotide encoding a
polypeptide of the present invention, the vector may be, for example, a cloning
vector or an expression vector. The vector may be, for example, in the form
of a plasmid, a viral particle, a phage, etc. The engineered host cells can be
cultured in conventional nutrient media modified as appropriate for activating
promoters, selecting transformants or amplifying the polynucleotides which
encode such polypeptides. The culture conditions, such as temperature, pH
and the like, are those previously used with the host cell selected for
expression, and will be apparent to the ordinarily skilled artisan.

Vectors include chromosomal, nonchromosomal and synthetic DNA
sequences, e.g., derivatives of SV40; bacterial plasmids; phage DNA;
baculovirus; yeast plasmids; vectors derived from combinations of plasmids
and phage DNA, viral DNA such as vaccinia, adenovirus, fowl pox virus, and
pseudorabies. However, any other vector may be used as long as it is
replicable and viable in the host.

The appropriate DNA sequence may be inserted into the vector by a
variety of procedures. In general, the DNA sequence is inserted into an
appropriate restriction endonuclease site(s) by procedures known in the art.
Such procedures and others are deemed to be within the scope of those skilled
in the art.

The DNA sequence in the expression vector is operatively linked to an
appropriate expression control sequence(s) (promoter) to direct mRNA
synthesis. As representative examples of such promoters, there may be
mentioned: LTR or SV40 promoter, the E. coli lac or trp, the phage lambda
promoter and other promoters known to control expression of genes in
prokaryotic or eukaryotic cells or their viruses. The expression vector also
contains a ribosome binding site for translation initiation and a transcription
terminator. The vector may also include appropriate sequences for amplifying
expression.

In addition, the expression vectors preferably contain one or more
selectable marker genes to provide a phenotypic trait for selection of
transformed host cells such as dihydrofolate reductase or neomycin resistance
for eukaryotic cell culture, or such as tetracycline or ampicillin resistance in
E. coli.

The vector containing the appropriate DNA sequence as hereinabove
described, as well as an appropriate promoter or control sequence, may be
employed to transform an appropriate host to permit the host to express the
proteins.

As representative examples of appropriate hosts, there may be
mentioned: bacterial cells, such as E. coli, Streptomyces, Salmonella
typhimurium; fungal cells, such as yeast; insect cells such as Drosophila S2
and Spodoptera Sf9; animal cells such as CHO, COS or Bowes melanoma;
adenoviruses; plant cells, etc. The selection of an appropriate host is deemed
to be within the scope of those skilled in the art from the teachings herein.



More particularly, the present invention also includes recombinant
constructs comprising one or more of the sequences as broadly described
above. The constructs comprise a vector, such as a plasmid or viral vector,
into which a sequence of the invention has been inserted, in a forward or
reverse orientation. In a preferred aspect of this embodiment, the construct
further comprises regulatory sequences, including, for example, a promoter,
operably linked to the sequence. Large numbers of suitable vectors and
promoters are known to those of skill in the art, and are commercially
available. The following vectors are provided by way of example. Bacterial:
pQE70, pQE60, pQE-9 (Qiagen, Inc.), pbs, pD10, phagescript, psiX174,
pbluescript SK, pbsks, pNH8A, pNH16a, pNH18A, pNH46A (Stratagene);
ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic:
pWLNEO, pSV2CAT, pOG44, pXT1, pSG (Stratagene) pSVK3, pBPV, pMSG,
pSVL (Pharmacia). However, any other plasmid or vector may be used as long
as they are replicable and viable in the host.

Promoter regions can be selected from any desired gene using CAT
(chloramphenicol transferase) vectors or other vectors with selectable markers.
Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial
promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trp. Eukaryotic
promoters include CMV immediate early, HSV thymidine kinase, early and late
SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the
appropriate vector and promoter is well within the level of ordinary skill in the
art.



Host cells containing the above-described constructs can be a higher
eukaryotic cell, such as a mammalian cell, or a lower eukaryotic cell, such as a
yeast cell, or the host cell can be a prokaryotic cell, such as a bacterial cell.
Introduction of the construct into the host cell can be effected by calcium
phosphate transfection, DEAE-Dextran mediated transfection, or
electroporation (Davis, L., Dibner, M., Battey, I., Basic Methods in Molecular
Biology, (1986)).

The constructs in host cells can be used in a conventional manner to
produce the gene product encoded by the recombinant sequence.
Alternatively, the polypeptides of the invention can be synthetically produced
by conventional peptide synthesizers.

Mature proteins can be expressed in mammalian cells, yeast,
bacteria, or other cells under the control of appropriate promoters. Cell-free
translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate
cloning and expression vectors for use with prokaryotic and eukaryotic hosts
are described by Sambrook, et al., Molecular Cloning: A Laboratory Manual,
Second Edition, Cold Spring Harbor, N.Y., (1989), Wu et al., Methods in Gene
Biotechnology (CRC Press, New York, NY, 1997), and Recombinant Gene
Expression Protocols, in Methods in Molecular Biology, Vol. 62, (Tuan, ed.,
Humana Press, Totowa, NJ, 1997), the disclosures of which are hereby
incorporated by reference.

Transcription of the DNA encoding the polypeptides of the present
invention by higher eukaryotes is increased by inserting an enhancer sequence
into the vector. Enhancers are cis-acting elements of DNA, usually about from
10 to 300 bp, that act on a promoter to increase its transcription. Examples
include the SV40 enhancer on the late side of the replication origin (bp 100 to
270), a cytomegalovirus early promoter enhancer, the polyoma enhancer on the
late side of the replication origin, and adenovirus enhancers.

Generally, recombinant expression vectors will include origins of
replication and selectable markers permitting transformation of the host cell,
e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and
a promoter derived from a highly-expressed gene to direct transcription of a
downstream structural sequence. Such promoters can be derived from
operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase
(PGK), α-factor, acid phosphatase, or heat shock proteins, among others. The
heterologous structural sequence is assembled in appropriate phase with
translation initiation and termination sequences. Optionally, the heterologous
sequence can encode a fusion protein including an N-terminal identification
peptide imparting desired characteristics, e.g., stabilization or simplified
purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting
a structural DNA sequence encoding a desired protein together with suitable
translation initiation and termination signals in operable reading phase with a
functional promoter. The vector will comprise one or more phenotypic
selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable
prokaryotic hosts for transformation include E. coli, Bacillus subtilis, Salmonella
typhimurium and various species within the genera Pseudomonas,
Streptomyces, and Staphylococcus, although others may also be employed as
a matter of choice.

As a representative but non-limiting example, useful expression vectors
for bacterial use can comprise a selectable marker and bacterial origin of
replication derived from commercially available plasmids comprising genetic
elements of the well known cloning vector pBR322 (ATCC 37017). Such
commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotec, Madison, WI, USA).
These pBR322 "backbone" sections are combined with an appropriate
promoter and the structural sequence to be expressed.

Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced by
appropriate means (e.g., temperature shift or chemical induction) and cells are
cultured for an additional period.

Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further
purification.

Microbial cells employed in expression of proteins can be disrupted by
any convenient method, including freeze-thaw cycling, sonication, a French
press, mechanical disruption, or use of cell lysing agents; such methods are
well known to those skilled in the art.

Various mammalian cell culture systems can also be employed to
express recombinant protein. Examples of mammalian expression systems
include the COS-7 lines of monkey kidney fibroblasts, described by Gluzman,
Cell, 23:175 (1981), and other cell lines capable of expressing a compatible
vector, for example, the C127, 3T3, CHO, HeLa and BHK cell lines.
Mammalian expression vectors will comprise an origin of replication, a suitable
promoter and enhancer, and also any necessary ribosome binding sites,
polyadenylation site, splice donor and acceptor sites, transcriptional
termination sequences, and 5' flanking nontranscribed sequences. DNA
sequences derived from the SV40 splice and polyadenylation sites may be
used to provide the required nontranscribed genetic elements.

The polypeptides can be recovered and/or purified from recombinant cell
cultures by well-known protein recovery and purification methods. Such
methodology may include ammonium sulfate or ethanol precipitation, acid
extraction, anion or cation exchange chromatography, phosphocellulose
chromatography, hydrophobic interaction chromatography, affinity
chromatography, hydroxylapatite chromatography and lectin chromatography.
Protein refolding steps can be used, as necessary, in completing configuration
of the mature protein. In this respect, chaperones may be used in such a
refolding procedure. Finally, high performance liquid chromatography (HPLC)
can be employed for final purification steps.

The polypeptides that are useful as immunogens in the present
invention may be a naturally purified product, or a product of chemical
synthetic procedures, or produced by recombinant techniques from a
prokaryotic or eukaryotic host (for example, by bacterial, yeast, higher plant,
insect and mammalian cells in culture). Depending upon the host employed in
a recombinant production procedure, the polypeptides of the present invention
may be glycosylated or may be non-glycosylated. Particularly preferred
immunogens are FimH-β-strand polypeptides or mannose-binding fragments
thereof since FimH is highly conserved among many bacterial species.
Therefore, antibodies against FimH (or its mannose-binding fragments) should
bind to FimH of other bacterial species (in addition to E. coli) and vaccines
against E. coli FimH (or FimH mannose-binding fragments) should give
protection against other bacterial infections in addition to E. coli infections (for
example, against other Enterobacteriaceae infections).

The polypeptides, their fragments or other derivatives, or analogs
thereof, or cells expressing them can also be used as an immunogen to
produce antibodies thereto. These antibodies can be, for example, polyclonal
or monoclonal antibodies. The present invention also includes chimeric, single
chain, and humanized antibodies, as well as Fab fragments, or the product of
an Fab expression library. Various procedures known in the art may be used
for the production of such antibodies and fragments.

Antibodies generated against the polypeptides corresponding to a
sequence of the present invention can be obtained by direct injection of the
polypeptides into an animal or by administering the polypeptides to an animal,
preferably a nonhuman. The antibody so obtained will then bind the
polypeptides themselves. In this manner, even a sequence encoding only a
fragment of the polypeptides can be used to generate antibodies binding the
whole native polypeptides.

For preparation of monoclonal antibodies, any technique which provides
antibodies produced by continuous cell line cultures can be used. Examples
include the hybridoma technique (Kohler and Milstein, 1975, Nature,
256:495-497), the trioma technique, the human B-cell hybridoma technique
(Kozbor et al., 1983, Immunology Today 4:72), and the EBV-hybridoma
technique to produce human monoclonal antibodies (Cole, et al., 1985, in
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).
Techniques described for the production of single chain antibodies (U.S.
Patent 4,946,778) can be adapted to produce single chain antibodies to
immunogenic polypeptide products of this invention. Also, transgenic mice
may be used to express humanized antibodies to immunogenic polypeptide
products of this invention. In addition, cells can be transformed with gene
sequences corresponding to antibody chains containing variable regions
complementary to the polypeptides of the invention and thereby generate
engineered antibodies to the polypeptides disclosed herein.



In carrying out the procedures of the present invention it is of course
to be understood that references to particular buffers, media, reagents, cells,
culture conditions and the like are not intended to be limiting, but are to be
read so as to include all related materials that one of ordinary skill in the art
would recognize as being of interest or value in the particular context in
which that discussion is presented. For example, it is often possible to
substitute one buffer system or culture medium for another and still achieve
similar, if not identical, results. Those of skill in the art will have sufficient
knowledge of such systems and methodologies so as to be able, without
undue experimentation, to make such substitutions as will optimally serve
their purposes in using the methods and procedures disclosed herein.

The present invention will now be further described by way of the 
following non-limiting examples. In applying the disclosure of these
examples, it should be kept clearly in mind that other and different
embodiments of the present invention will no doubt suggest themselves to
those of skill in the relevant art.



Example 1
Folding of FimH in cis using a donor strand

It was shown experimentally that a pilus subunit protein, otherwise
unable to fold independently (or able to fold only inefficiently) due to lack
of a C-terminal G β-strand in the absence of a periplasmic chaperone to
provide the correct steric information, was able to fold correctly when the
missing strand was provided in cis. Here, the missing seventh β-strand was
fused onto the 3' end of FimH. Specifically, the DNA sequence encoding the
first 13 amino acids of FimG (see Figure 1), referred to herein as the donor
strand, was provided to FimH in cis by fusing it directly to the 3'-end of the
FimH coding sequence, with the resulting DNA sequence encoding the donor
strand complemented FimH protein (called dscFimH). In addition, a hairpin
loop region in PapD consisting of Asp-Asn-Lys-Gln was inserted upstream of
the donor strand to form a hinge region in the expressed protein and thereby
allow the donor strand to fold back and form an anti-parallel arrangement
with FimH (with the arrangement shown in Figure 9).

To construct dscFimH, the following two oligonucleotides were
annealed together and ligated into the ClaI and BamHI sites of pUC18-FimH
to create pUC18-dscFimH (here, with DNKQ as linker; these are the 4
amino acids used for the loop or hinge region, as described above, in
standard one-letter amino acid code), where the top (coding) strand is:



5'-CGATTATTGGCGTGACTTTTGTTTATCAAGATAACAAACAGGATGTCA
CCATCACGGTGAACGGTAAGGTCGTCGCCAAATAAG-3'          SEQ ID NO: 22

and for the bottom strand:

5'-GATCCTTATTTGGCGACGACCTTACCGTTCACCGTGATGGTGACCATC
CTGTTTGTTATCTTGATAAACAAAAGTCACGCCAATAAT-3'       SEQ ID NO: 23
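As an illustrative check, the top-strand oligonucleotide can be translated to
confirm that it encodes the C-terminal residues of FimH, the DNKQ hinge, and
the 13-residue FimG donor strand. The following is a minimal Python sketch,
not part of the disclosed method. It assumes the standard genetic code, an
OCR-corrected reading of the oligonucleotide (GATGTC in place of the garbled
"GAT6TC"), and a reading frame that starts after the first two bases of the
ClaI overhang; the function name translate is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Top (coding) strand, SEQ ID NO: 22, as read from the text (assumed OCR fix).
TOP_STRAND = (
    "CGATTATTGGCGTGACTTTTGTTTATCAAGATAACAAACAGGATGTCA"
    "CCATCACGGTGAACGGTAAGGTCGTCGCCAAATAAG"
)

def translate(dna, offset=0):
    """Translate dna from the given offset until a stop codon or the end."""
    protein = []
    for i in range(offset, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)

# Assumed frame: the first two bases (CG) complete the preceding FimH codon.
peptide = translate(TOP_STRAND, offset=2)
print(peptide)
```

Under these assumptions the translated peptide ends with the DNKQ linker
followed by the donor strand and a stop codon, which agrees with the
C-terminus given for SEQ ID NO: 11.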



Here, pUC18-dscFimH was sequenced and then subcloned into the
EcoRI and BamHI sites of pTrc99A (Amann et al., Gene 69, 301 (1988)) to
create pTrc-dscFimH. The fimH gene was likewise subcloned from pUC18-FimH
into pTrc99A using the EcoRI and BamHI sites to create pTrc-FimH.

FimH, FimH + FimC, and dscFimH were expressed separately. The
plasmids encoding FimH (pTrc-FimH), dscFimH (pTrc-dscFimH), and FimC
(pJP4) were expressed in C600 (see Jones et al., Proc. Natl. Acad. Sci. USA,
90, 8397 (1993)). Overnight cultures were diluted 1:100 into Luria broth and
grown to an OD600 of 0.6 followed by induction with 0.5 mM IPTG for 1
hour. Periplasms were prepared by a known procedure (see Slonim et al.,
EMBO J. 11, 4747 (1992)).

The presence of FimC and FimH or dscFimH in periplasmic extracts
was monitored by immunoblotting using anti-FimCH antibodies. In addition,
mannose-sepharose chromatography (Jones et al., 1993, above) was used to
determine the ability of FimH or dscFimH to bind its receptor. FimH was
degraded when expressed alone but was stabilized by co-expression of the
chaperone. In contrast to FimH, dscFimH was stable in the periplasm in the
absence of FimC. FimH bound to mannose-sepharose beads when it was
co-expressed with FimC and thus eluted as a FimCH complex. Since FimH is
degraded in the absence of FimC, no full length FimH eluted from the
mannose-sepharose beads. In contrast, dscFimH bound to, and specifically
eluted from, the mannose beads when expressed alone. When FimC was
co-expressed with dscFimH it did not form a complex with dscFimH and thus
did not co-elute with dscFimH from the mannose-sepharose beads.

In contrast to FimH, dscFimH was not able to complement a fimH⁻
(FimH negative) type 1 gene cluster to restore a hemagglutination positive
phenotype. Here, fimH and dscFimH were subcloned from pUC18-FimH and
pUC18-dscFimH into pBad18-Kn (see Guzman et al., J. Bacteriol., 177, 4121
(1995)) using the EcoRI and XbaI sites to create pBad-FimH and
pBad-dscFimH, which were transformed into ORN103/pETS10. ORN103 does not
produce type 1 pili and pETS10 encodes a fimH⁻ type 1 gene cluster. The
strains were diluted 1:100 into Luria broth and grown to an OD600 of 0.8
followed by induction with 0.1 mM IPTG and 0.02% arabinose for one hour.
The cells were harvested and hemagglutination assays were performed by
methods known in the art (see: Hultgren et al., Mol. Microbiol. 4, 1311
(1990)).

In accordance with the present invention, the added donor strand
occupied the groove and completed the Ig fold of the FimH pilin domain,
thus shielding the surface that would normally interact first with the
chaperone and then with another subunit in the pilus. Thus, dscFimH did not
bind FimC nor assemble into the pilus.



FimH and dscFimH were purified and their denaturation curves were
obtained. FimH was isolated from a FimCH complex by incubating the
complex in 3 M urea and subjecting it to cation exchange chromatography,
which yielded 2 peaks, one of which contained pure FimH. The FimCH
complex was purified from the periplasm of C600/pHJ9205/pHJ20 (see
Jones et al. (1993) and Slonim et al. (1992), above). The periplasmic extracts
were dialyzed against 20 mM MES pH 5.4, injected onto a Source 15S
column (Pharmacia), and FimCH eluted with 65 mM NaCl. The eluate was
injected onto a Butyl4FF column in 0.55 M (NH4)2SO4/20 mM MES pH 5.4
and FimCH eluted at 0.3 M (NH4)2SO4. The FimCH complex was brought to
3 M urea to separate the two subunits. Pure FimH in 3 M urea was collected
from the flow through of a Source 15S column. FimH retained its native
structure in 3 M urea as determined by circular dichroism (CD) and its ability
to bind mannose.

DscFimH was also purified from the periplasm of C600/pTrc-dscFimH
(see Slonim et al. (1992)). The periplasm was dialyzed against 20 mM Tris-Cl
pH 8.8 and dscFimH was collected from the flow through of a Source 15Q
column. This flow through was injected onto a Butyl4FF column in 0.9 M
(NH4)2SO4/20 mM Tris-Cl pH 8.8 and dscFimH eluted at 0.4 M (NH4)2SO4.
The eluate was then loaded onto a Source 15S column in 20 mM MES pH
4.7 and dscFimH eluted at 55 mM NaCl.

FimH and dscFimH had similar denaturation curves, with denaturation
complete only at concentrations above 8.5 M urea (+ 4 mM DTT) as
determined by tyrosine fluorescence spectroscopy emission maxima (350
nm). For this measurement, 22.5 µg of FimH or dscFimH in 20 mM MES pH
6.5 + 4 mM DTT was incubated with the appropriate urea concentration.
Fluorescence was measured using an excitation wavelength of 290 nm with
emission at 350 nm on an AlphaScan PTI fluorometer. FimH did not begin to
denature until a concentration of 6.5 M urea was reached, with the midpoint
of the denaturation curve occurring at approximately 7.5 M urea.
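The shape of such a denaturation curve can be pictured with a simple
two-state sigmoid. The sketch below is illustrative only: the 7.5 M midpoint
comes from the text above, whereas the steepness value (3.0 per molar) is an
assumed parameter chosen so that unfolding begins near 6.5 M and is nearly
complete near 8.5 M. It is not a fit to the measured data, and the function
name fraction_unfolded is hypothetical.

```python
import math

def fraction_unfolded(urea_molar, midpoint=7.5, slope_per_molar=3.0):
    """Two-state sigmoid: modeled fraction of protein unfolded at a urea concentration.

    midpoint is taken from the text; slope_per_molar is an assumed steepness.
    """
    return 1.0 / (1.0 + math.exp(-slope_per_molar * (urea_molar - midpoint)))

# Tabulate the modeled curve at a few urea concentrations (molar).
for concentration in (3.0, 6.5, 7.5, 8.5, 9.0):
    print(f"{concentration:4.1f} M urea: {fraction_unfolded(concentration):.3f} unfolded")
```

With these assumed parameters the model leaves the protein essentially folded
in 3 M urea, which is in line with the observation that FimH retains its
native structure under those conditions.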

Due to the twist in the β-sheet formed by strands D, C and F, the G1
strand of FimC is unable to satisfy all potential backbone hydrogen bonding
interactions with the F strand of FimH. Modeling of the finished structures
was performed using SYBYL (Tripos Associates) and Insight II (Molecular
Simulations Inc.) running on a Silicon Graphics workstation.






Example 2 
In Vitro Folding Assay 



An in vitro folding assay was used to demonstrate that the missing
steric information in the amino acid sequence of pilus subunit proteins can be
provided in cis. This assay was based on attempted refolding of
urea-denatured FimH (obtained from a FimCH complex as already described) and
dscFimH as determined by examination of the CD spectra after rapid dilution
of the denatured proteins. Spectra were measured from 150 µg of protein in
20 mM MES pH 6.5 using a 0.02 cm cell in a JASCO J715
spectropolarimeter. Denatured proteins were diluted to 0.45 M urea.

Before denaturation, FimH (3 M urea) had a virtually identical β-sheet
CD spectrum as compared to dscFimH. After denaturation in 9 M urea (+ 4
mM DTT), the CD spectra of FimH and dscFimH became characteristic of
non-native proteins. Light scattering of the denatured dscFimH indicated the
presence of large aggregates. However, rapid dilution of dscFimH led to the
refolding of the protein into its native β-sheet structure. The refolded
dscFimH bound mannose and was monodisperse (as shown by light
scattering), indicating that it had refolded into its native structure. In
contrast, attempts to refold FimH led to insoluble aggregates and therefore
elicited no signal after filtering. FimC was unable to bind to denatured FimH
after its rapid dilution. However, if FimC was present in the diluent, FimH
formed a complex with FimC and folded into its native mannose-binding
β-sheet structure. Thus, in these assays, dscFimH folded independently
whereas FimH folded in the presence of, but not in the absence of, FimC.
FimC was capable of binding to native FimH separated from the FimCH
complex by 3 M urea, confirming that the chaperone can indeed bind to folded
subunits. As further evidence, a mutation in Arg 8 of FimC (see Figure 2), a
residue critical in chaperone complex formation, abolished the ability of the
mutant protein to bind to native FimH or to facilitate re-folding of denatured
FimH.
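The rapid-dilution step follows ordinary dilution arithmetic (C1 x V1 =
C2 x V2). As a minimal sketch, assuming the protein is taken directly from
9 M urea down to the stated 0.45 M, the fold dilution and the diluent volume
can be computed as below. The 10 µL sample volume is a hypothetical figure
used only for illustration, as are the function names.

```python
def dilution_factor(start_molar, final_molar):
    """Fold dilution needed to go from start_molar to final_molar."""
    if final_molar <= 0 or final_molar > start_molar:
        raise ValueError("final concentration must be positive and not exceed the start")
    return start_molar / final_molar

def diluent_volume(sample_volume, start_molar, final_molar):
    """Volume of urea-free diluent to add to sample_volume (same units returned)."""
    return sample_volume * (dilution_factor(start_molar, final_molar) - 1.0)

fold = dilution_factor(9.0, 0.45)            # 9 M urea diluted to 0.45 M
added = diluent_volume(10.0, 9.0, 0.45)      # hypothetical 10 µL denatured sample
print(f"{fold:.0f}-fold dilution; add {added:.0f} µL diluent to 10 µL sample")
```

Under this assumption the refolding experiment corresponds to a 20-fold
dilution of the denatured protein.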



In sum, these results show that the chaperone is necessary for
subunit folding by providing the subunit protein with missing steric
information. Here, the information required for subunit folding resides in two
polypeptides. In such an arrangement, the C-terminal carboxyl group of the F
strand of the subunit anchors to the conserved Arg8 and Lys112 residues in
the chaperone cleft. Subsequent β-zippering along the G1 strand facilitates
the formation of the initial F β-strand in the pilin, which in turn initiates
β-sheet formation. These interactions would position the strand-F hydrophobic
side chains of the subunit in register with the G1 strand alternating
hydrophobic residues of the chaperone so as to facilitate the proper collapse
of the hydrophobic core of the subunit. In dscFimH, the steric information
normally provided by the chaperone is now present in a single polypeptide
chain, provided by the sequence corresponding to the N-terminus of FimG.
The missing sequence provided in cis enables the pilin domain of FimH to
fold into a perfect canonical Ig fold, mimicking the fold that is otherwise
completed by FimG in the tip fibrillum.

As already stated, FimH based vaccines have been shown to protect
mice and monkeys from experimental bladder infections (see: Langermann et
al., Science 276, 607 (1997); Langermann et al., J. Infect. Dis., in press).
However, the production of such vaccines required co-expression of FimC
and the purification of a FimCH complex. In accordance with the invention
disclosed herein, donor strand complementation now permits production of
dscFimH vaccines. These vaccines are advantageous in that they comprise a
single subunit and are expected to be more stable due to the anti-parallel
donor strand. Vaccination with dscFimH has been found to produce
anti-FimH titers that are comparable to those achieved with the FimCH
complex. Thus, the methods taught by the present invention readily facilitate
the production of adhesin-based vaccines generally.



Example 3
dsc-Immunogen Dosages

This experiment evaluated dosages of the dsc immunogen (dose
down) at 15 µg, 3 µg, and 0.6 µg. CFA/IFA was used as an adjuvant. Here,
30, 6 and 1.2 µg FimCH were used for comparison. It was expected that the
ratio between the dsc and FimCH would be similar since the amount of FimH
at each dose of FimCH complex (1:1 ratio of FimH to FimC) is equivalent to
the amount present in each dose of dscFimH. Endpoint titers using anti-FimH
T3 and anti-FimH dsc as detecting antigens were used to illustrate
this correlation, evaluate immune responses and demonstrate comparability.
With the exception of the 0.6 µg dose of the dsc immunogen (vs. 1.2 µg
FimCH), endpoint titers appeared similar. Results are tabulated in Table 2 as
endpoint dilutions using ELISA. Each group was done in duplicate. Note, for
example, that Group 1 and Group 7 differ in the amount of pilin used as
immunogen. This is because Group 7 received 30 µg FimCH, which is the
complex of FimC and FimH, whereas Group 1 received 15 µg dscFimH,
which is FimH with only the donor strand segment at the end. Thus, by
doubling the amount of the FimCH complex a more closely related molar
quantity was achieved, since the molecular weight of the complex is almost
twice that of the corresponding donor strand complemented material. This
demonstrates the immunogenicity of the dsc protein versus the
pilin-chaperone complex.
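The dose-matching argument can be made concrete with a rough molar
calculation. The sketch below is an approximation rather than a measurement:
it takes the chain lengths from the sequence listing (FimH 279 residues,
FimC 205 residues, dscFimH 296 residues) and assumes an average residue mass
of about 110 Da, which is a common rule of thumb and not a value given in
this document. The names used are hypothetical.

```python
AVG_RESIDUE_DA = 110.0  # assumed average residue mass (rule of thumb)

# Chain lengths taken from the <211> fields of the sequence listing.
RESIDUES = {"FimH": 279, "FimC": 205, "dscFimH": 296}

def approx_mass_da(n_residues):
    """Approximate molecular mass in daltons from the residue count."""
    return n_residues * AVG_RESIDUE_DA

def nanomoles(micrograms, mass_da):
    """Convert a mass in micrograms to nanomoles for a given molecular mass."""
    return micrograms * 1000.0 / mass_da

fimch_mass = approx_mass_da(RESIDUES["FimH"] + RESIDUES["FimC"])  # 1:1 complex
dsc_mass = approx_mass_da(RESIDUES["dscFimH"])

mass_ratio = fimch_mass / dsc_mass
nmol_fimch = nanomoles(30.0, fimch_mass)  # Group 7 dose
nmol_dsc = nanomoles(15.0, dsc_mass)      # Group 1 dose
print(f"mass ratio FimCH/dscFimH = {mass_ratio:.2f}")
print(f"30 µg FimCH = {nmol_fimch:.2f} nmol; 15 µg dscFimH = {nmol_dsc:.2f} nmol")
```

On these assumptions the complex is somewhat less than twice the mass of
dscFimH, so doubling the FimCH dose brings the two immunogens to molar
amounts of the same order.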



Table 2

                          Anti-FimH T3           Anti-FimH dsc
Group  Immunogen          3 week     8 week      3 week     8 week
1      15 µg FimH dsc     204800     1638400     204800     1638400
2      15 µg FimH dsc     102400     409600      204800     1638400
3      3 µg FimH dsc      102400     819200      102400     1638400
4      3 µg FimH dsc      51200      409600      51200      819200
5      0.6 µg FimH dsc    3200       25600       12800      409600
6      0.6 µg FimH dsc    400        25600       6400       409600
7      30 µg FimCH        204800     1638400     204800     1638400
8      30 µg FimCH        204800     1638400     204800     1638400
9      CFA/IFA            100        100         100        100
10     CFA/IFA            100        100         100        100
11     No Injection       100        100         100        100
12     No Injection       100        100         100        100
13     6 µg FimCH         102400     1638400     102400     1638400
14     6 µg FimCH         102400     819200      102400     1638400
15     1.2 µg FimCH       51200      819200      51200      1638400
16     1.2 µg FimCH       51200      819200      51200      1638400
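Because endpoint titers fall on a two-fold dilution series, duplicate groups
are conveniently compared on a log2 scale or by geometric mean. The following
minimal sketch summarizes the 8 week anti-FimH T3 titers of Table 2 in that
way. It assumes that 100 is the starting (background) dilution, as suggested
by the control groups, and the helper names are hypothetical.

```python
import math

BASELINE_TITER = 100  # assumed starting dilution (titer of the control groups)

# 8 week anti-FimH T3 endpoint titers for the duplicate groups of Table 2.
WEEK8_ANTI_FIMH_T3 = {
    "15 ug dscFimH": (1638400, 409600),
    "3 ug dscFimH": (819200, 409600),
    "0.6 ug dscFimH": (25600, 25600),
    "30 ug FimCH": (1638400, 1638400),
    "6 ug FimCH": (1638400, 819200),
    "1.2 ug FimCH": (819200, 819200),
}

def geometric_mean(values):
    """Geometric mean, the usual average for titers on a dilution series."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def doublings_over_baseline(titer):
    """Number of two-fold dilution steps above the baseline titer."""
    return math.log2(titer / BASELINE_TITER)

for immunogen, titers in WEEK8_ANTI_FIMH_T3.items():
    gmt = geometric_mean(titers)
    print(f"{immunogen:>15}: GMT {gmt:>9.0f} ({doublings_over_baseline(gmt):.1f} doublings)")
```

Read this way, the matched dose pairs differ by about one to two dilution
steps at the two higher doses, with a larger gap at the lowest dose.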



The results of challenge with the NU14 strain of E. coli, administered
intraurethrally, are shown in Figure 10. The results indicate protection by the
donor strand complemented protein (dscFimH). Here, the dsc antigen was
compared to control (CFA/IFA: complete Freund's adjuvant/incomplete
Freund's adjuvant) and naive (no adjuvant or antigen) as well as to high dose
FimCH (CFA/IFA). Significant protection was observed with the dscFimH
when compared to either the control or naive groups, at least at the 3 µg and
15 µg doses.

Example 4
Immunogenicity of DSC Protein

In these experiments, MF59 (Chiron Corp.) was used as adjuvant and
monitoring was continued over an 18 week period. Overall, the responses
were comparable between the FimCH and dscFimH at all doses. Endpoint
titers are shown in Table 3 as well as in the graphs of Figure 11. In the
challenge experiment (results depicted in Figure 12), better protection was
observed at the lower versus the higher doses (which could be due to the
use of the MF59 adjuvant). The 0.6 µg dose appeared to be the best.

Table 3.

[column-group heading illegible in source]

Group  Immunogen          3 week   8 week   12 week  1? week  18 week
1      15 µg FimH dsc     12800    204800   204800   51200    [missing]
2      15 µg FimH dsc     25600    409600   409600   102400   [missing]
3      3 µg FimH dsc      800      204800   204800   102400   409600
4      3 µg FimH dsc      6400     204800   204800   102400   409600
5      0.6 µg FimH dsc    100      12800    12800    6400     204800
6      0.6 µg FimH dsc    100      51200    51200    12800    409600
7      30 µg FimCH        25600    409600   409600   204800   409600
8      30 µg FimCH        25600    409600   204800   102400   409600
9      MF59               100      100      100      100      100
10     MF59               100      100      100      100      [missing]
11     Naive              100      100      100      100      100
12     Naive              100      100      100      100      200
13     6 µg FimCH         25600    409600   409600   102400   409600
14     6 µg FimCH         25600    409600   409600   204800   819200
15     1.2 µg FimCH       100      25600    25600    25600    204800
16     1.2 µg FimCH       100      12800    6400     6400     102400

Example 5
Low Dose dscFimH

Further experiments were performed to show that acceptable
responses could be achieved at lower doses. Immunogenicity was
determined (using MF59 as adjuvant) at several lower doses. The results
are presented in Tables 4, 5 and 6. In general, even at a dose as low as
0.32 µg (following a boost) or 2 µg (pre-boost) good responses were
observed. For this experiment, C3H/HeJ mice were immunized on day 0
and boosted at week 4 (5 mice/group). EPTs (end point titers) represent
titers for pooled sera for each group. Immune responses were measured
pre- and post-boost against three different capture antigens (by ELISA):
FimH T3 (a FimH truncate), dscFimH and FimH (0.4 M Urea).



Table 4

Immunogen (dose)     α-FimH T3 (3 weeks)   α-FimH T3 (8 weeks)
FimHdsc (2 µg)       1600                  102400
FimHdsc (0.8 µg)     <100                  51200
FimHdsc (0.32 µg)    <100                  12800
FimCH (4 µg)         51200                 204800
FimCH (0.26 µg)      6400                  409600


Table 5

Immunogen (dose)     α-FimH T3 (3 weeks)   α-FimHdsc (8 weeks)
FimHdsc (2 µg)       6400                  204800
FimHdsc (0.8 µg)     200                   204800
FimHdsc (0.32 µg)    <100                  51200
FimCH (4 µg)         51200                 409600
FimCH (0.26 µg)      6400                  409600



Table 6

Immunogen (dose)     α-FimH T3 (3 wks)     α-FimH (0.4 M Urea) (8 wks)
FimHdsc (2 µg)       800                   51200
FimHdsc (0.8 µg)     <100                  102400
FimHdsc (0.32 µg)    <100                  12800
FimCH (4 µg)         25600                 204800
FimCH (0.26 µg)      1600                  102400
WHAT IS CLAIMED IS:

1. An immunogenic complex comprising a pilus-protein component 
and a donor strand component. 

2. The immunogenic complex of claim 1 wherein said pilus-protein
component is covalently bound to said donor strand component.

3. The immunogenic complex of claim 1 wherein said pilus-protein
component is non-covalently bound to said donor strand component.

4. The immunogenic complex of claim 1 wherein said pilus protein is
an adhesin.

5. The immunogenic complex of claim 4 wherein said adhesin is FimH.

6. The immunogenic complex of claim 1 wherein the donor strand is
selected from the group consisting of SEQ ID NOs: 3, 4, 5, 6, 7, and 8.

7. A polypeptide comprising a pilus-protein portion and a donor
complement portion as part of the same amino acid sequence.

8. The polypeptide of claim 7 wherein the donor complement portion is
derived from a bacterial chaperone.

9. The polypeptide of claim 7 wherein the donor complement portion is
a bacterial pilin or adhesin, a portion thereof, or derived from a bacterial pilin
or adhesin.

10. The polypeptide of claim 8 wherein the chaperone is selected from
the group consisting of PapD and FimC.

11. The polypeptide of claim 8 wherein the bacterial chaperone is
derived from E. coli.

12. The polypeptide of claim 7 wherein the donor complement portion
is selected from the group consisting of SEQ ID NOS: 3, 4, 5, 6, 7, and 8.

13. The polypeptide of claim 7 wherein the pilus-protein portion is
found in bacteria of the family enterobacteriaceae.

14. The polypeptide of claim 13 wherein the bacterium is E. coli.

15. The polypeptide of claim 7 wherein the pilus-protein is an adhesin.

16. The polypeptide of claim 7 wherein the pilus-protein is a pilin.

17. The polypeptide of claim 7 wherein the pilus protein is PapK.

18. The polypeptide of claim 15 wherein the adhesin is FimH.

19. The polypeptide of claim 18 having a sequence selected from the
group consisting of SEQ ID NO: 10 and 11.

20. A single polypeptide or polypeptide complex comprising the
polypeptide of claim 7 and an adhesin.

21. A single polypeptide or polypeptide complex comprising the
polypeptide of claim 7 and a pilin.

22. The polypeptide complex of claim 20 and 21 wherein the
pilus-protein portion of the polypeptide is FimG and the adhesin is FimH.

23. A polynucleotide comprising a coding region for the polypeptide of
claim 7, 20 or 21.

24. An antibody specific for an immunogenic complex selected from
the group consisting of the polypeptides and/or complexes of claims 1, 2, 3,
4, 5, and 6.

25. An antibody specific for a polypeptide selected from the group
consisting of the polypeptides of claims 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, and 22.




26. The antibody of claim 24 or 25 wherein said antibody is a 
monoclonal antibody. 

27. A genetically engineered cell expressing the antibody of claim 26. 

28. A genetically engineered cell expressing the polynucleotide of 
claim 23. 

29. A vector comprising the polynucleotide of claim 23. 

30. A genetically engineered cell expressing the polypeptide of claim
19.

31. A composition comprising the antibody of claim 24 or 25
suspended in a pharmacologically acceptable carrier, diluent or excipient.

32. A composition comprising an immunogenic complex selected from
the group consisting of the complexes of claims 1, 2, 3, 4, 5, and 6, said
immunogenic complex being suspended in a pharmacologically acceptable
carrier, diluent or excipient.

33. A composition comprising a polypeptide selected from the group
consisting of the polypeptides of claims 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, and 22, said polypeptide being suspended in a
pharmacologically acceptable carrier, diluent or excipient.

34. A vaccine composition comprising an immunogenically effective
amount of an immunogenic complex selected from the group consisting of
the complexes of claims 1, 2, 3, 4, 5, and 6, said immunogenic complex
being suspended in a pharmacologically acceptable carrier, diluent or
excipient.

35. A vaccine composition comprising an immunogenically effective
amount of a polypeptide selected from the group consisting of the
polypeptides of claims 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, and 22, said polypeptide being suspended in a pharmacologically
acceptable carrier, diluent or excipient.

36. A method of preventing a disease in a mammal at risk thereof
comprising administering to said animal the vaccine composition of claim 34
or 35.

37. The method of claim 36 wherein the disease is caused by a
bacterium of the family enterobacteriaceae.

38. The method of claim 37 wherein the bacterium is E. coli.

39. The method of claim 30 wherein the mammal is a human.

40. The method of claim 36 wherein the disease is a urinary tract
disease.

41. A method of treating a disease in a mammal afflicted therewith
comprising administering to said animal a pharmacologically effective amount
of the composition of claim 31.

42. The method of claim 41 wherein the disease is caused by a
bacterium of the family enterobacteriaceae.

43. The method of claim 42 wherein the bacterium is E. coli.

44. The method of claim 41 wherein the animal is a human.

45. The method of claim 44 wherein the disease is a urinary tract
infection.

46. The method of claim 41 wherein the disease is caused by a
bacterium of the family Enterobacteriaceae.

47. The method of claim 46 wherein the disease is caused by E. coli.

48. A method of producing an immunogenic polypeptide comprising
synthesizing a pilus-polypeptide having a donor complement strand attached
thereto.

49. The method of claim 48 wherein the donor complement strand is
selected from the group consisting of SEQ ID NOS: 3, 4, 5, 6, 7, and 8.

50. The method of claim 48 wherein the pilus-polypeptide is an
adhesin.

51. The method of claim 50 wherein the adhesin is FimH.

SEQUENCE LISTING

<110> Hultgren, Scott J.
Pinkner, Jerome S.
Sauer, Frederic
Barnhart, Michelle
Waksman, Gabriel
Knight, Stefan

<120> Donor Strand Complemented Pilin and Adhesin Broad-Based
Vaccines

<130> 469201-479

<140>
<141>

<150> U.S. 60/143,582
<151> 1999-07-13

<150> U.S. 60/144,359
<151> 1999-07-16

<150> U.S. 60/184,442
<151> 2000-02-23

<160> 23

<170> PatentIn Ver. 2.1

<210> 1
<211> 279
<212> PRT
<213> Escherichia coli
<400> 1

Phe Ala Cys Lys Thr Ala Asn Gly Thr Ala Ile Pro Ile Gly Gly Gly
1 5 10 15

Ser Ala Asn Val Tyr Val Asn Leu Ala Pro Val Val Asn Val Gly Gln
20 25 30

Asn Leu Val Val Asp Leu Ser Thr Gln Ile Phe Cys His Asn Asp Tyr
35 40 45

Pro Glu Thr Ile Thr Asp Tyr Val Thr Leu Gln Arg Gly Ser Ala Tyr
50 55 60

Gly Gly Val Leu Ser Asn Phe Ser Gly Thr Val Lys Tyr Ser Gly Ser
65 70 75 80

Ser Tyr Pro Phe Pro Thr Thr Ser Glu Thr Pro Arg Val Val Tyr Asn
85 90 95

Ser Arg Thr Asp Lys Pro Trp Pro Val Ala Leu Tyr Leu Thr Pro Val
100 105 110

Ser Ser Ala Gly Gly Val Ala Ile Lys Ala Gly Ser Leu Ile Ala Val
115 120 125

Leu Ile Leu Arg Gln Thr Asn Asn Tyr Asn Ser Asp Asp Phe Gln Phe
130 135 140

Val Trp Asn Ile Tyr Ala Asn Asn Asp Val Val Val Pro Thr Gly Gly
145 150 155 160

Cys Asp Val Ser Ala Arg Asn Val Thr Val Thr Leu Pro Asp Tyr Pro
165 170 175

Gly Ser Val Pro Ile Pro Leu Thr Val Tyr Cys Ala Lys Ser Gln Asn
180 185 190

Leu Gly Tyr Tyr Leu Ser Gly Thr Thr Ala Asp Ala Gly Asn Ser Ile
195 200 205

Phe Thr Asn Thr Ala Ser Phe Ser Pro Ala Gln Gly Val Gly Val Gln
210 215 220

Leu Thr Arg Asn Gly Thr Ile Ile Pro Ala Asn Asn Thr Val Ser Leu
225 230 235 240

Gly Ala Val Gly Thr Ser Ala Val Ser Leu Gly Leu Thr Ala Asn Tyr
245 250 255

Ala Arg Thr Gly Gly Gln Val Thr Ala Gly Asn Val Gln Ser Ile Ile
260 265 270

Gly Val Thr Phe Val Tyr Gln
275



<210> 2 
<211> 205 
<212> PRT 

<213> Escherichia coli
<400> 2

Gly Val Ala Leu Gly Ala Thr Arg Val Ile Tyr Pro Ala Gly Gln Lys
1 5 10 15

Gln Val Gln Leu Ala Val Thr Asn Asn Asp Glu Asn Ser Thr Tyr Leu
20 25 30

Ile Gln Ser Trp Val Glu Asn Ala Asp Gly Val Lys Asp Gly Arg Phe
35 40 45

Ile Val Thr Pro Pro Leu Phe Ala Met Lys Gly Lys Lys Glu Asn Thr
50 55 60

Leu Arg Ile Leu Asp Ala Thr Asn Asn Gln Leu Pro Gln Asp Arg Glu
65 70 75 80

Ser Leu Phe Trp Met Asn Val Lys Ala Ile Pro Ser Met Asp Lys Ser
85 90 95

Lys Leu Thr Glu Asn Thr Leu Gln Leu Ala Ile Ile Ser Arg Ile Lys
100 105 110

Leu Tyr Tyr Arg Pro Ala Lys Leu Ala Leu Pro Pro Asp Gln Ala Ala
115 120 125

Glu Lys Leu Arg Phe Arg Arg Ser Ala Asn Ser Leu Thr Leu Ile Asn
130 135 140

Pro Thr Pro Tyr Tyr Leu Thr Val Thr Glu Leu Asn Ala Gly Thr Arg
145 150 155 160

Val Leu Glu Asn Ala Leu Val Pro Pro Met Gly Glu Ser Ala Val Lys
165 170 175

Leu Pro Ser Asp Ala Gly Ser Asn Ile Thr Tyr Arg Thr Ile Asn Asp
180 185 190

Tyr Gly Ala Leu Thr Pro Lys Met Thr Gly Val Met Glu
195 200 205



<210> 3 
<211> 15 
<212> PRT 

<213> Escherichia coli
<400> 3

Thr Leu Gln Leu Ala Ile Ile Ser Arg Ile Lys Leu Tyr Tyr Arg
1 5 10 15



<210> 4 
<211> 15 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Reverse of the
G1 strand of FimC shown in SEQ ID NO: 3.

<400> 4

Arg Tyr Tyr Leu Lys Ile Arg Ser Ile Ile Ala Leu Gln Leu Thr
1 5 10 15



<210> 5 
<211> 14 

<212> PRT 

<213> Escherichia coli 
<400> 5 

Asp Val Thr Ile Thr Val Asn Gly Lys Val Val Ala Lys Pro
1 5 10



<210> 6 
<211> 14 
<212> PRT 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: Reverse of the 
PapG donor strand shown in SEQ ID NO: 5. 

<400> 6 

Pro Lys Ala Val Val Lys Gly Asn Val Thr Ile Thr Val Asp
1 5 10
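SEQ ID NOs: 4 and 6 are described as the reverse of SEQ ID NOs: 3 and 5,
respectively. That relationship is easy to confirm programmatically. The
short sketch below is illustrative only; it uses the residue strings as read
from the listing with the evident OCR substitutions corrected (for example
"He" and "lie" read as Ile, "Gin" as Gln), which is an assumption about the
original text, and the function name is hypothetical.

```python
# Residue strings as read from the listing (OCR-corrected, see lead-in).
SEQ_ID_3 = "Thr Leu Gln Leu Ala Ile Ile Ser Arg Ile Lys Leu Tyr Tyr Arg"
SEQ_ID_4 = "Arg Tyr Tyr Leu Lys Ile Arg Ser Ile Ile Ala Leu Gln Leu Thr"
SEQ_ID_5 = "Asp Val Thr Ile Thr Val Asn Gly Lys Val Val Ala Lys Pro"
SEQ_ID_6 = "Pro Lys Ala Val Val Lys Gly Asn Val Thr Ile Thr Val Asp"

def is_reverse_of(forward, candidate):
    """True if candidate lists the residues of forward in reverse order."""
    return forward.split()[::-1] == candidate.split()

print("SEQ 4 is reverse of SEQ 3:", is_reverse_of(SEQ_ID_3, SEQ_ID_4))
print("SEQ 6 is reverse of SEQ 5:", is_reverse_of(SEQ_ID_5, SEQ_ID_6))
```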



<210> 7 
<211> 13 
<212> PRT 

<213> Escherichia coli 
<400> 7 

Asp Val Thr Ile Thr Val Asn Gly Lys Val Val Ala Lys
1 5 10



<210> 8
<211> 13
<212> PRT

<213> Escherichia coli
<400> 8

Asp Ser Thr Ile Thr Ile Arg Gly Tyr Val Arg Asp Asn
1 5 10



<210> 9 
<211> 143 
<212> PRT 

<213> Escherichia coli 
<400> 9 

Asp Val Thr Ile Thr Val Asn Gly Lys Val Val Ala Lys Pro Cys Thr
1 5 10 15

Val Ser Thr Thr Asn Ala Thr Val Asp Leu Gly Asp Leu Tyr Ser Phe
20 25 30

Ser Leu Met Ser Ala Gly Ala Ala Ser Ala Trp His Asp Val Ala Leu
35 40 45

Glu Leu Thr Asn Cys Pro Val Gly Thr Ser Arg Val Thr Ala Ser Phe
50 55 60

Ser Gly Ala Ala Asp Ser Thr Gly Tyr Tyr Lys Asn Gln Gly Thr Ala
65 70 75 80

Gln Asn Ile Gln Leu Glu Leu Gln Asp Asp Ser Gly Asn Thr Leu Asn
85 90 95

Thr Gly Ala Thr Lys Thr Val Gln Val Asp Asp Ser Ser Gln Ser Ala
100 105 110

His Phe Pro Leu Gln Val Arg Ala Leu Thr Val Asn Gly Gly Ala Thr
115 120 125

Gln Gly Thr Ile Gln Ala Val Ile Ser Ile Thr Tyr Thr Tyr Ser
130 135 140



<210> 10
<211> 304
<212> PRT

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: DscFimH with G1
strand of FimC and linker

<400> 10

Phe Ala Cys Lys Thr Ala Asn Gly Thr Ala Ile Pro Ile Gly Gly Gly
1 5 10 15

Ser Ala Asn Val Tyr Val Asn Leu Ala Pro Val Val Asn Val Gly Gln
20 25 30

Asn Leu Val Val Asp Leu Ser Thr Gln Ile Phe Cys His Asn Asp Tyr
35 40 45

Pro Glu Thr Ile Thr Asp Tyr Val Thr Leu Gln Arg Gly Ser Ala Tyr
50 55 60

Gly Gly Val Leu Ser Asn Phe Ser Gly Thr Val Lys Tyr Ser Gly Ser
65 70 75 80

Ser Tyr Pro Phe Pro Thr Thr Ser Glu Thr Pro Arg Val Val Tyr Asn
85 90 95

Ser Arg Thr Asp Lys Pro Trp Pro Val Ala Leu Tyr Leu Thr Pro Val
100 105 110

Ser Ser Ala Gly Gly Val Ala Ile Lys Ala Gly Ser Leu Ile Ala Val
115 120 125

Leu Ile Leu Arg Gln Thr Asn Asn Tyr Asn Ser Asp Asp Phe Gln Phe
130 135 140

Val Trp Asn Ile Tyr Ala Asn Asn Asp Val Val Val Pro Thr Gly Gly
145 150 155 160

Cys Asp Val Ser Ala Arg Asn Val Thr Val Thr Leu Pro Asp Tyr Pro
165 170 175

Gly Ser Val Pro Ile Pro Leu Thr Val Tyr Cys Ala Lys Ser Gln Asn
180 185 190

Leu Gly Tyr Tyr Leu Ser Gly Thr Thr Ala Asp Ala Gly Asn Ser Ile
195 200 205

Phe Thr Asn Thr Ala Ser Phe Ser Pro Ala Gln Gly Val Gly Val Gln
210 215 220

Leu Thr Arg Asn Gly Thr Ile Ile Pro Ala Asn Asn Thr Val Ser Leu
225 230 235 240

Gly Ala Val Gly Thr Ser Ala Val Ser Leu Gly Leu Thr Ala Asn Tyr
245 250 255

Ala Arg Thr Gly Gly Gln Val Thr Ala Gly Asn Val Gln Ser Ile Ile
260 265 270

Gly Val Thr Phe Val Tyr Gln Gly Ser Gly Ser Gly Ser Gly Ser Gly
275 280 285

Ser Thr Leu Gln Leu Ala Ile Ile Ser Arg Ile Lys Leu Tyr Tyr Arg
290 295 300



<210> 11 
<211> 296 
<212> PRT

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: DscFimH with
donor strand from FimG and linker

<400> 11

Phe Ala Cys Lys Thr Ala Asn Gly Thr Ala Ile Pro Ile Gly Gly Gly
1 5 10 15

Ser Ala Asn Val Tyr Val Asn Leu Ala Pro Val Val Asn Val Gly Gln
20 25 30

Asn Leu Val Val Asp Leu Ser Thr Gln Ile Phe Cys His Asn Asp Tyr
35 40 45

Pro Glu Thr Ile Thr Asp Tyr Val Thr Leu Gln Arg Gly Ser Ala Tyr
50 55 60

Gly Gly Val Leu Ser Asn Phe Ser Gly Thr Val Lys Tyr Ser Gly Ser
65 70 75 80

Ser Tyr Pro Phe Pro Thr Thr Ser Glu Thr Pro Arg Val Val Tyr Asn
85 90 95

Ser Arg Thr Asp Lys Pro Trp Pro Val Ala Leu Tyr Leu Thr Pro Val
100 105 110

Ser Ser Ala Gly Gly Val Ala Ile Lys Ala Gly Ser Leu Ile Ala Val
115 120 125

Leu Ile Leu Arg Gln Thr Asn Asn Tyr Asn Ser Asp Asp Phe Gln Phe
130 135 140

Val Trp Asn Ile Tyr Ala Asn Asn Asp Val Val Val Pro Thr Gly Gly
145 150 155 160

Cys Asp Val Ser Ala Arg Asn Val Thr Val Thr Leu Pro Asp Tyr Pro
165 170 175

Gly Ser Val Pro Ile Pro Leu Thr Val Tyr Cys Ala Lys Ser Gln Asn
180 185 190

Leu Gly Tyr Tyr Leu Ser Gly Thr Thr Ala Asp Ala Gly Asn Ser Ile
195 200 205

Phe Thr Asn Thr Ala Ser Phe Ser Pro Ala Gln Gly Val Gly Val Gln
210 215 220

Leu Thr Arg Asn Gly Thr Ile Ile Pro Ala Asn Asn Thr Val Ser Leu
225 230 235 240

Gly Ala Val Gly Thr Ser Ala Val Ser Leu Gly Leu Thr Ala Asn Tyr
245 250 255

Ala Arg Thr Gly Gly Gln Val Thr Ala Gly Asn Val Gln Ser Ile Ile
260 265 270

Gly Val Thr Phe Val Tyr Gln Asp Asn Lys Gln Asp Val Thr Ile Thr
275 280 285

Val Asn Gly Lys Val Val Ala Lys
290 295



<210> 12 
<211> 153 
<212> PRT 

<213> Escherichia coli 
<400> 12 
Asp Ser Thr Ile Thr Ile Arg Gly Tyr Val Val Asn Ala Ala Cys Ala
1 5 10 15

Val Asp Ala Gly Ser Val Asp Gln Thr Val Gln Leu Gly Gln Val Arg
20 25 30

Thr Ala Thr Leu Lys Gln Ala Gly Ala Thr Ser Ser Ala Val Gly Phe
35 40 45

Asn Ile Gln Leu Asn Asp Cys Asp Thr Thr Val Ala Thr Lys Ala Ala
50 55 60

Val Ala Phe Leu Gly Thr Ala Ile Asp Ser Thr His Pro Lys Val Leu
65 70 75 80

Ala Leu Gln Ser Ser Ala Ala Gly Ser Ala Thr Asn Val Gly Val Gln
85 90 95

Ile Leu Asp Arg Thr Gly Asn Glu Leu Thr Leu Asp Gly Ala Thr Phe
100 105 110

Ser Ala Glu Thr Thr Leu Asn Asn Gly Thr Asn Thr Ile Pro Phe Gln
115 120 125

Ala Arg Tyr Phe Ala Thr Gly Ala Ala Thr Pro Gly Ala Ala Asn Ala
130 135 140

Asp Ala Thr Phe Lys Val Gln Tyr Gln
145 150 



<210> 13 
<211> 152
<212> PRT 

<213> Escherichia coli 
<400> 13 

Asp Ser Thr Ile Thr Ile Arg Gly Tyr Val Arg Asp Asn Gly Cys Ser
1 5 10 15

Val Ala Ala Glu Ser Thr Asn Phe Thr Val Asp Leu Met Glu Asn Ala
20 25 30

Ala Lys Gln Phe Asn Asn Ile Gly Ala Thr Thr Pro Val Val Pro Phe
35 40 45

Arg Ile Leu Leu Ser Pro Cys Gly Asn Ala Val Ser Ala Val Lys Val
50 55 60
Gly Phe Leu Gly Val Ala Asp Ser His Asn Ala Asn Leu Leu Ala Leu
65 70 75 80

Glu Asn Thr Val Ser Ala Ala Ser Gly Leu Gly Ile Gln Leu Leu Asn
85 90 95

Glu Gln Asn Gln Ile Pro Leu Asn Ala Pro Ser Ser Ala Leu Ser Trp
100 105 110

Thr Thr Leu Thr Pro Gly Lys Pro Asn Thr Leu Asn Phe Tyr Ala Arg
115 120 125

Leu Met Ala Thr Gln Val Pro Val Thr Ala Gly His Ile Asn Ala Thr
130 135 140

Ala Thr Phe Thr Leu Glu Tyr Gln
145 150



<210> 14 
<211> 218 
<212> PRT 

<213> Escherichia coli 

<400> 14 

Ala Val Ser Leu Asp Arg Thr Arg Ala Val Phe Asp Gly Ser Glu Lys
1 5 10 15

Ser Met Thr Leu Asp Ile Ser Asn Asp Asn Lys Gln Leu Pro Tyr Leu
20 25 30

Ala Gln Ala Trp Ile Glu Asn Glu Asn Gln Glu Lys Ile Ile Thr Gly
35 40 45

Pro Val Ile Ala Thr Pro Pro Val Gln Arg Leu Glu Pro Gly Ala Lys
50 55 60

Ser Met Val Arg Leu Ser Thr Thr Pro Asp Ile Ser Lys Leu Pro Gln
65 70 75 80

Asp Arg Glu Ser Leu Phe Tyr Phe Asn Leu Arg Glu Ile Pro Pro Arg
85 90 95

Ser Glu Lys Ala Asn Val Leu Gln Ile Ala Leu Gln Thr Lys Ile Lys
100 105 110

Leu Phe Tyr Arg Pro Ala Ala Ile Lys Thr Arg Pro Asn Glu Val Trp
115 120 125

Gln Asp Gln Leu Ile Leu Asn Lys Val Ser Gly Gly Tyr Arg Ile Glu
130 135 140

Asn Pro Thr Pro Tyr Tyr Val Thr Val Ile Gly Leu Gly Gly Ser Glu
145 150 155 160

Lys Gln Ala Glu Glu Gly Glu Phe Glu Thr Val Met Leu Ser Pro Arg
165 170 175

Ser Glu Gln Thr Val Lys Ser Ala Asn Tyr Asn Thr Pro Tyr Leu Ser
180 185 190

Tyr Ile Asn Asp Tyr Gly Gly Arg Pro Val Leu Ser Phe Ile Cys Asn
195 200 205

Gly Ser Arg Cys Ser Val Lys Lys Glu Lys
210 215 



<210> 15 

<211> 157 

<212> PRT 

<213> Escherichia coli 

<400> 15 

Ser Asp Val Ala Phe Arg Gly Asn Leu Leu Asp Arg Pro Cys His Val
1 5 10 15

Ser Gly Asp Ser Leu Asn Lys His Val Val Phe Lys Thr Arg Ala Ser
20 25 30

Arg Asp Phe Trp Tyr Pro Pro Gly Arg Ser Pro Thr Glu Ser Phe Val
35 40 45

Ile Arg Leu Glu Asn Cys His Ala Thr Ala Val Gly Lys Ile Val Thr
50 55 60

Leu Thr Phe Lys Gly Thr Glu Glu Ala Ala Leu Pro Gly His Leu Lys
65 70 75 80

Val Thr Gly Val Asn Ala Gly Arg Leu Gly Ile Ala Leu Leu Asp Thr
85 90 95

Asp Gly Ser Ser Leu Leu Lys Pro Gly Thr Ser His Asn Lys Gly Gln
100 105 110

Gly Glu Lys Val Thr Gly Asn Ser Leu Glu Leu Pro Phe Gly Ala Tyr
115 120 125 

Val Val TQa Thr Pro Glu Ala Leu Arg Thr Lys Ser Val Val Pro Gly 
130 135 140 

Asp Tyr Glu Ala Thr Ala Thr Phe Glu Leu Thr Tyr Arg 
145 150 155 



<210> 16 
<211> 163 
<212> PRT 

<213> Escherichia coli 
<400> 16 

Ala Pro Thr Ile Pro Gln Gly Gln Gly Lys Val Thr Phe Asn Gly Thr
1 5 10 15

Val Val Asp Ala Pro Cys Ser Ile Ser Gln Lys Ser Ala Asp Gln Ser
20 25 30

Ile Asp Phe Gly Gln Leu Ser Lys Ser Phe Leu Glu Ala Gly Gly Val
35 40 45

Ser Lys Pro Met Asp Leu Asp Ile Glu Leu Val Asn Cys Asp Ile Thr
50 55 60

Ala Phe Lys Gly Gly Asn Gly Ala Lys Lys Gly Thr Val Lys Leu Ala
65 70 75 80

Phe Thr Gly Pro Ile Val Asn Gly His Ser Asp Glu Leu Asp Thr Asn
85 90 95

Gly Gly Thr Gly Thr Ala Ile Val Val Gln Gly Ala Gly Lys Asn Val
100 105 110

Val Phe Asp Gly Ser Glu Gly Asp Ala Asn Thr Leu Lys Asp Gly Glu
115 120 125

Asn Val Leu His Tyr Thr Ala Val Val Lys Lys Ser Ser Ala Val Gly
130 135 140

Ala Ala Val Thr Glu Gly Ala Phe Ser Ala Val Ala Asn Phe Asn Leu
145 150 155 160

Thr Tyr Gln



<210> 17
<211> 148 
<212> PRT 

<213> Escherichia coli 
<400> 17 

Asp Asn Leu Thr Phe Arg Gly Lys Leu Ile Ile Pro Ala Cys Thr Val
1 5 10 15

Ser Asn Thr Thr Val Asp Trp Gln Asp Val Glu Ile Gln Thr Leu Ser
20 25 30

Gln Asn Gly Asn His Glu Lys Glu Phe Thr Val Asn Met Arg Cys Pro
35 40 45

Tyr Asn Leu Gly Thr Met Lys Val Thr Ile Thr Ala Thr Asn Thr Tyr
50 55 60

Asn Asn Ala Ile Leu Val Gln Asn Thr Ser Asn Thr Ser Ser Asp Gly
65 70 75 80

Leu Leu Val Tyr Leu Tyr Asn Ser Asn Ala Gly Asn Ile Gly Thr Ala
85 90 95

Ile Thr Leu Gly Thr Pro Phe Thr Pro Gly Lys Ile Thr Gly Asn Asn
100 105 110

Ala Asp Lys Thr Ile Ser Leu His Ala Lys Leu Gly Tyr Lys Gly Asn
115 120 125

Met Gln Asn Leu Ile Ala Gly Pro Phe Ser Ala Thr Ala Thr Leu Val
130 135 140 

Ala Ser Tyr Ser 
145 



<210> 18 
<211> 148 
<212> PRT 

<213> Escherichia coli 
<400> 18 

Asp Val Gln Ile Asn Ile Arg Gly Asn Val Tyr Ile Pro Pro Cys Thr
1 5 10 15

Ile Asn Asn Gly Gln Asn Ile Val Val Asp Phe Gly Asn Ile Asn Pro
20 25 30

Glu His Val Asp Asn Ser Arg Gly Glu Val Thr Lys Thr Ile Ser Ile
35 40 45

Ser Cys Pro Tyr Lys Ser Gly Ser Leu Trp Ile Lys Val Thr Gly Asn
50 55 60

Thr Met Gly Gly Gly Gln Asn Asn Val Leu Ala Thr Asn Ile Thr His
65 70 75 80

Phe Gly Ile Ala Leu Tyr Gln Gly Lys Gly Met Ser Thr Pro Leu Ile
85 90 95

Leu Gly Asn Gly Ser Gly Asn Gly Tyr Gly Val Thr Ala Gly Leu Asp
100 105 110

Thr Ala Arg Ser Thr Phe Thr Phe Thr Ser Val Pro Phe Arg Asn Gly
115 120 125

Ser Gly Ile Leu Asn Gly Gly Asp Phe Gln Thr Thr Ala Ser Met Ser
130 135 140

Met Ile Tyr Asn
145 



<210> 19 
<211> 337 
<212> PRT 

<213> Escherichia coli 
<400> 19 

Met Lys Lys Trp Phe Pro Ala Leu Leu Phe Ser Leu Cys Val Ser Gly
1 5 10 15

Glu Ser Ser Ala Trp Asn His Asn Ile Val Phe Tyr Ser Leu Gly Asn
20 25 30

Val Asn Ser Tyr Gln Gly Gly Asn Val Val Ile Thr Gln Arg Pro Gln
35 40 45

Phe Ile Thr Ser Trp Arg Pro Gly Ile Ala Thr Val Thr Trp Asn Gln
50 55 60

Cys Asn Gly Pro Glu Phe Ala Asp Gly Ser Trp Ala Tyr Tyr Arg Glu
65 70 75 80

Tyr Ile Ala Trp Val Val Phe Pro Lys Lys Val Met Thr Gln Asn Gly
85 90 95

Tyr Pro Leu Phe Ile Glu Val His Asn Lys Gly Ser Trp Ser Glu Glu
100 105 110

Asn Thr Gly Asp Asn Asp Ser Tyr Phe Phe Leu Lys Gly Tyr Lys Trp
115 120 125

Asp Glu Arg Ala Phe Asp Ala Gly Asn Leu Cys Gln Lys Pro Gly Glu
130 135 140

Thr Thr Arg Leu Thr Glu Lys Phe Asp Asp Ile Ile Phe Lys Val Ala
145 150 155 160

Leu Pro Ala Asp Leu Pro Leu Gly Asp Tyr Ser Val Thr Ile Pro Tyr
165 170 175

Thr Ser Gly Ile Gln Arg His Phe Ala Ser Tyr Leu Gly Ala Arg Phe
180 185 190

Lys Ile Pro Tyr Asn Val Ala Lys Thr Leu Pro Arg Glu Asn Glu Met
195 200 205

Leu Phe Leu Phe Lys Asn Ile Gly Gly Cys Arg Pro Ser Ala Gln Ser
210 215 220

Leu Glu Ile Lys His Gly Asp Leu Ser Ile Asn Ser Ala Asn Asn His
225 230 235 240

Tyr Ala Ala Gln Thr Leu Ser Val Ser Cys Asp Val Pro Ala Asn Ile
245 250 255

Arg Phe Met Leu Leu Arg Asn Thr Thr Pro Thr Tyr Ser His Gly Lys
260 265 270

Lys Phe Ser Val Gly Leu Gly His Gly Trp Asp Ser Ile Val Ser Val
275 280 285

Asn Gly Val Asp Thr Gly Glu Thr Thr Met Arg Trp Tyr Lys Ala Gly
290 295 300

Thr Gln Asn Leu Thr Ile Gly Ser Arg Leu Tyr Gly Glu Ser Ser Lys
305 310 315 320

Ile Gln Pro Gly Val Leu Ser Gly Ser Ala Thr Leu Leu Met Ile Leu
325 330 335

Pro



<210> 20 
<211> 13 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Donor strand 
derived from the first 13 residues of PapF 

<400> 20 

Asp Val Gln Ile Asn Ile Arg Gly Asn Val Tyr Ile Pro
1 5 10



<210> 21 
<211> 354 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Donor strand
complemented form of PapG (SEQ ID NO: 19)

<400> 21 

Met Lys Lys Trp Phe Pro Ala Leu Leu Phe Ser Leu Cys Val Ser Gly
1 5 10 15

Glu Ser Ser Ala Trp Asn His Asn Ile Val Phe Tyr Ser Leu Gly Asn
20 25 30

Val Asn Ser Tyr Gln Gly Gly Asn Val Val Ile Thr Gln Arg Pro Gln
35 40 45

Phe Ile Thr Ser Trp Arg Pro Gly Ile Ala Thr Val Thr Trp Asn Gln
50 55 60

Cys Asn Gly Pro Glu Phe Ala Asp Gly Ser Trp Ala Tyr Tyr Arg Glu
65 70 75 80

Tyr Ile Ala Trp Val Val Phe Pro Lys Lys Val Met Thr Gln Asn Gly
85 90 95

Tyr Pro Leu Phe Ile Glu Val His Asn Lys Gly Ser Trp Ser Glu Glu
100 105 110

Asn Thr Gly Asp Asn Asp Ser Tyr Phe Phe Leu Lys Gly Tyr Lys Trp
115 120 125

Asp Glu Arg Ala Phe Asp Ala Gly Asn Leu Cys Gln Lys Pro Gly Glu
130 135 140

Thr Thr Arg Leu Thr Glu Lys Phe Asp Asp Ile Ile Phe Lys Val Ala
145 150 155 160

Leu Pro Ala Asp Leu Pro Leu Gly Asp Tyr Ser Val Thr Ile Pro Tyr
165 170 175

Thr Ser Gly Ile Gln Arg His Phe Ala Ser Tyr Leu Gly Ala Arg Phe
180 185 190

Lys Ile Pro Tyr Asn Val Ala Lys Thr Leu Pro Arg Glu Asn Glu Met
195 200 205

Leu Phe Leu Phe Lys Asn Ile Gly Gly Cys Arg Pro Ser Ala Gln Ser
210 215 220

Leu Glu Ile Lys His Gly Asp Leu Ser Ile Asn Ser Ala Asn Asn His
225 230 235 240

Tyr Ala Ala Gln Thr Leu Ser Val Ser Cys Asp Val Pro Ala Asn Ile
245 250 255

Arg Phe Met Leu Leu Arg Asn Thr Thr Pro Thr Tyr Ser His Gly Lys
260 265 270

Lys Phe Ser Val Gly Leu Gly His Gly Trp Asp Ser Ile Val Ser Val
275 280 285

Asn Gly Val Asp Thr Gly Glu Thr Thr Met Arg Trp Tyr Lys Ala Gly
290 295 300

Thr Gln Asn Leu Thr Ile Gly Ser Arg Leu Tyr Gly Glu Ser Ser Lys
305 310 315 320

Ile Gln Pro Gly Val Leu Ser Gly Ser Ala Thr Leu Leu Met Ile Leu
325 330 335

Pro Asp Asn Lys Gln Asp Val Gln Ile Asn Ile Arg Gly Asn Val Tyr
340 345 350

Ile Pro



<210> 22
<211> 84 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide
used in cloning

<400> 22 

cgattattgg cgtgactttt gtttatcaag ataacaaaca ggatgtcacc atcacggtga 60 
acggtaaggt cgtcgccaaa taag 84 



<210> 23 
<211> 87 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
used in cloning 

<400> 23 

gatccttatt tggcgacgac cttaccgttc accgtgatgg tgaccatcct gtttgttatc 60 
ttgataaaca aaagtcacgc caataat 87 